From internship project to published research and a role at Amazon

How Linghui Luo's research helps ensure code is checked and ready to deploy.

Building quality software tends to follow a familiar routine for most developers. You write code on your computer within an integrated development environment (IDE), and then, to check for any security flaws, you upload it to a central repository and run a security scan. The results appear on a dashboard in your web browser, separate from the IDE.

Linghui Luo was asked to rethink this workflow during a five-month internship at Amazon Web Services (AWS) in 2020. In doing so, she came up with a prototype for a novel way to run security scans on code. The prototype became the basis for a 2021 research paper and evolved into the newly launched Amazon CodeGuru Security plugin for two IDEs, Amazon SageMaker Studio and Jupyter notebooks.

Luo joined Amazon full-time in early 2022 as an AWS applied scientist, shortly after earning her PhD in computer science at the Heinz Nixdorf Institute at Paderborn University in Germany. Now based in Berlin, she has continued her research into quicker, easier methods for ensuring code is stable and secure. The first line of her GitHub biography page says it best: “The usage of security analysis tools should become an industrial convention in secure software development. However, we need to create usable analysis tools first.”

Streamlining security scans

Luo's work makes it easier for developers to use Amazon CodeGuru Security, a tool that can identify critical issues, security vulnerabilities, and hard-to-find bugs. CodeGuru Security is a static analysis tool, which means it evaluates each line of code without running it, offering an opportunity to head off problems as work progresses.
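
To make "without running it" concrete, here is a minimal sketch of a static check using Python's standard `ast` module. It is a toy illustration only and says nothing about how CodeGuru Security works internally; the rule (flagging direct calls to `eval`), the `find_eval_calls` helper, and the sample `SOURCE` snippet are all assumptions made up for this example.

```python
import ast

# Hypothetical code under analysis. It is only parsed, never executed.
SOURCE = '''
user_input = input()
value = eval(user_input)
print(value)
'''

def find_eval_calls(source):
    """Return the line numbers of direct eval() calls in the source text."""
    tree = ast.parse(source)  # builds a syntax tree without running the code
    lines = []
    for node in ast.walk(tree):
        is_call = isinstance(node, ast.Call)
        if is_call and isinstance(node.func, ast.Name) and node.func.id == "eval":
            lines.append(node.lineno)
    return sorted(lines)

findings = find_eval_calls(SOURCE)
print(findings)
```

A check like this finishes almost instantly because it only inspects syntax; the deeper security analysis discussed in this article tracks how data flows through a program, which is far more expensive.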

But she doesn't just focus on the software — she also studies the developers who use it. The results affirm a key Amazon practice: working backwards from the customer.

CodeGuru Security operates in the cloud, which is ideal for static analysis tools — particularly ones that perform the kind of deep analysis that security testing requires. In the cloud, users can track and store issues in a central location, and each scan runs more efficiently than it would on a single machine.

When developers use popular continuous integration workflows, they receive security recommendations every time they push code. The recommendations appear in the developer’s web browser.

What if developers could have a direct line to CodeGuru Security, running static analysis in the cloud from within the IDE? This was the challenge AWS applied scientist Martin Schäf presented to Luo for her internship.

"At the beginning, most people would think this is a software engineering problem, but it's actually not," Luo said. "What we took was basically a user-centric approach."

Starting with the user

Luo first interviewed AWS developers to determine what they expected from an IDE-based static analysis tool. When should the analysis happen? How automated should it be? How long did they think it should take?

The problem may not be as straightforward as it sounds. While some tools already do static analysis from within an IDE, it is typically "lightweight" scanning that catches glaring problems and takes roughly 10 seconds at most to complete. Static application security testing, on the other hand, examines the code more intensively. That takes several minutes, even with cloud resources, though in the past such testing was much slower, taking hours. A successful integration would need to manage user expectations on timing, among other aspects.

Based on her interviews with developers, Luo developed a prototype CodeGuru Security extension for Visual Studio, a popular IDE. Then she ran usability tests to see whether what she built matched developers' needs.

The project, Luo said, expanded her horizons in understanding how to build more useful tools for developers. Actions that may have seemed trivial to her, like needing to take code out of the IDE and upload it somewhere else for analysis, proved to be pain points for developers who wanted a static analysis integration to be as seamless as possible.

"As a PhD student who has always been at university, I had some assumptions about what developers would like to have," Luo said. "But after talking to them, I found out that what they want is totally different." The experience reinforced for her the importance of talking to users before developing a tool.

Validating code from notebooks

The new CodeGuru plugin for Jupyter and SageMaker Studio is meant to help users prevent bugs from sneaking into code developed in notebooks. Data scientists like notebooks because they can append text and relevant images to lines of code.

But the platform can lend itself to reproducibility issues. Say you have four lines of code, each in a different code cell within a notebook. One user can run the code cells in an arbitrary order, but when the code is shared, another user might run them in a different sequence. That’s an issue, because running code cells in a different order might produce different results. Luo offers the example in a recent paper on the issue, co-authored with Amazon colleagues Schäf, Ben Liblit, Alejandro Molina Ramirez, Rajdeep Mukherjee, Goran Piskachev, Omer Tripp, and Willem Visser, along with Zachary Patterson of the University of Texas at Dallas.

Left: code cells executed in nonlinear order; right: code cells executed in linear order.
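
The order dependence described above can be sketched in a few lines of Python. This is not the example from the paper; the four cell contents and the `run_cells` helper are hypothetical, and `exec` against a shared dictionary merely stands in for a notebook kernel that keeps state between cell runs.

```python
# Each string stands in for one notebook code cell (hypothetical contents).
cells = [
    "x = 1",        # cell 0
    "x = x + 1",    # cell 1
    "x = x * 10",   # cell 2
    "result = x",   # cell 3
]

def run_cells(order):
    """Execute the cells in the given order against one shared namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace["result"]

linear = run_cells([0, 1, 2, 3])     # top to bottom
nonlinear = run_cells([0, 2, 1, 3])  # cells 1 and 2 swapped
print(linear, nonlinear)  # prints "20 11"
```

The same four cells yield 20 when run top to bottom and 11 when two of them are swapped, which is why a notebook that looks correct to its author may not reproduce for the next person who runs it.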

Notebooks are great for data exploration and presentation, Luo explained, but too often, the code gets passed on and deployed without being checked. "If you cannot reproduce the result, how can you ensure that your code is running correctly?" Luo said. The CodeGuru plugin can flag such potential flaws and suggest improvements.

Of course, a security recommendation is only truly useful if the developer actually applies it. Ongoing research on Luo's team explores how to gauge the quality of static analysis rules by measuring certain developer actions.

Visible impact

Luo developed an interest in computers as a high school student in China. It was a "natural choice," she said, to go right into computer science for college. Her interest in computer security emerged from a personal experience while she was a master's student. She noticed that an app she was using allowed a user to change the cell phone number attached to an account without any verification. The app was connected to her bank, and she was appalled at how insecure it was. That realization led to her focus on software security during her doctoral program.

Luo's initiative during her Amazon internship, and the openness of her team, made it possible for her to make the most of her time there. By the time her internship was done, she already had an offer to join the team full-time. Schäf, Luo’s hiring manager, noted that Luo owned the science work on the SageMaker plugin from start to finish.

“At Amazon, we are customer obsessed, which is why it is so important to have scientists like her that follow a good scientific process to help our engineers understand which solutions bring the best value to our customer,” he said. “She quickly turns ideas into prototypes that allow us to verify what benefits our customers and what doesn’t.”

Luo had considered staying in academia after earning her doctoral degree, and at one point she also received an offer to join a research institution in Germany as tenure-track faculty. But ultimately, she decided Amazon was the place for her.

"It was a really hard decision," she said. "But I always wanted to do more applicable science. My team at Amazon is a good platform for me to be able to put science into production and have a visible impact in a short time."
