AI is the hottest workplace skill right now, but where should you start if you want to enter this high-paying field? An AI engineering expert has shared a learning roadmap that walks would-be AI engineers through the capabilities to build, step by step.
The roadmap comes from Aurimas Griciūnas, a European AI engineering expert with more than ten years of experience in the data field. He was the product leader at Neptune, a company building model-training analysis tools, which was acquired by OpenAI in December this year.
Griciūnas points out that although many people feel lost about how to enter AI, it is not too late to start. The AI engineer role is still young and evolving rapidly, and standing out in a highly competitive field requires a clear development path and a focus on key skills.
Before becoming an AI engineer, you must first become an engineer. Griciūnas stresses that while fundamentals remain the key to career growth, in an era of rapid change it is no longer feasible to lay a complete foundation first and only then learn advanced material: you have to learn while practicing.
He recommends being comfortable with the programming languages **Python** and **Bash**, and mastering the following tools and techniques (a minimal sketch follows the list):
- FastAPI
- Pydantic
- uv
- git
- Asynchronous programming
- Packaging CLI tools and running FastAPI services
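To make this concrete, here is a minimal sketch of a FastAPI service with Pydantic request/response validation. The endpoint and model names are illustrative, not prescribed by the roadmap:

```python
# Minimal FastAPI + Pydantic sketch; names are illustrative placeholders.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    prompt: str
    max_tokens: int = 256  # Pydantic validates types and defaults for us

class PredictResponse(BaseModel):
    completion: str

@app.post("/predict", response_model=PredictResponse)
async def predict(req: PredictRequest) -> PredictResponse:
    # A real service would call a model here; we just echo the prompt.
    return PredictResponse(completion=f"Echo: {req.prompt[:req.max_tokens]}")

# Run with: uvicorn main:app --reload  (or via uv: uv run uvicorn main:app)
```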
In addition, you need statistics and machine learning knowledge. These mathematical and statistical foundations help you understand statistical modeling concepts and evaluate large language model systems.
While learning to call model APIs, you must understand how models differ, for example:
- The differences between a base version and its upgraded versions
- The areas of expertise of each model
- The characteristics of reasoning models, multi-modal models, and so on
You must also master structured output and prompt caching: by caching previously processed prompts and responses, you can speed up replies and reduce costs.
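For structured output, a common pattern is to ask the model for JSON and validate it with Pydantic (v2 API assumed). The `call_llm` function below is a placeholder for whatever client you actually use:

```python
# Structured-output sketch: validate a model's JSON reply with Pydantic v2.
import json
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    vendor: str
    total: float
    currency: str

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real API call (OpenAI, Anthropic, etc.).
    return '{"vendor": "ACME", "total": 42.5, "currency": "USD"}'

raw = call_llm("Extract the invoice as JSON with keys vendor, total, currency.")
try:
    invoice = Invoice.model_validate(json.loads(raw))
    print(invoice.total)
except (json.JSONDecodeError, ValidationError) as err:
    # In production you would retry or repair the malformed output here.
    print(f"Model returned malformed output: {err}")
```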
The core goal is to make large language models behave as expected. That does not mean training a new model; it means making the model stable and controllable on real tasks by adjusting prompts and tool usage. Griciūnas calls this model adaptation.
Prompt engineering is the core skill here, including:
- Correct prompt structure (different models and tasks call for different phrasing; format, word order, and context all affect the results)
- Understanding context-length limits, and adjusting prompts and data chunking accordingly
- Various prompting techniques, such as:
  - Chain of Thought (guide the model through step-by-step reasoning)
  - Tree of Thought (let the model explore multiple reasoning paths)
  - Few-shot (provide examples that define the task format)
- Advanced techniques such as Self-Consistency, Reflection, and ReAct (start with a simple framework)
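As an illustration, few-shot and chain-of-thought prompting often amounts to nothing more than careful string assembly. The example question and format below are invented for demonstration:

```python
# Illustrative few-shot + chain-of-thought prompt construction.
FEW_SHOT_EXAMPLES = [
    ("A shop sells 3 pens for $6. How much do 5 pens cost?",
     "3 pens cost $6, so 1 pen costs $2. 5 pens cost 5 * $2 = $10."),
]

def build_prompt(question: str) -> str:
    parts = ["Answer the question. Think step by step before the final answer.\n"]
    for q, a in FEW_SHOT_EXAMPLES:      # few-shot: demonstrate the task format
        parts.append(f"Q: {q}\nA: {a}\n")
    parts.append(f"Q: {question}\nA:")  # worked examples elicit step-by-step reasoning
    return "\n".join(parts)

print(build_prompt("4 apples cost $2. How much do 10 apples cost?"))
```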
Fine-tune a model only when necessary; for practice, it is recommended to start with simple tools such as Unsloth.
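For a sense of what supervised fine-tuning involves, here is a conceptual sketch using Hugging Face transformers (this is not Unsloth's own API, which wraps a similar flow with memory optimizations; model and dataset names are placeholders, and exact arguments vary by library version):

```python
# Conceptual supervised fine-tuning sketch (NOT Unsloth's API).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "a-small-base-model"        # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("your-dataset")   # placeholder: assumes a "text" column

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=train,
    # mlm=False makes the collator copy input_ids into labels (causal LM)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```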
The watershed between simply calling an API and building a system is connecting the model to external memory.
- Use vector databases: understand the trade-offs of vector similarity search and the different indexing methods
- Learn graph databases (adoption is still limited and costs are high)
- Hybrid retrieval: combine the strengths of keyword search and semantic search
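A toy version of hybrid retrieval, combining a keyword-overlap score with cosine similarity over fake, hand-written embeddings, might look like this:

```python
# Toy hybrid retrieval: blend keyword overlap with vector cosine similarity.
# The documents and 3-d "embeddings" are fabricated for illustration.
import numpy as np

docs = {
    "doc1": ("refund policy for orders", np.array([0.9, 0.1, 0.0])),
    "doc2": ("shipping times by region", np.array([0.1, 0.8, 0.2])),
}

def keyword_score(query: str, text: str) -> float:
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def vector_score(q_vec: np.ndarray, d_vec: np.ndarray) -> float:
    return float(q_vec @ d_vec / (np.linalg.norm(q_vec) * np.linalg.norm(d_vec)))

def hybrid_rank(query: str, q_vec: np.ndarray, alpha: float = 0.5):
    # alpha trades off keyword relevance against semantic relevance
    scored = [
        (alpha * keyword_score(query, text)
         + (1 - alpha) * vector_score(q_vec, vec), name)
        for name, (text, vec) in docs.items()
    ]
    return sorted(scored, reverse=True)

print(hybrid_rank("refund policy", np.array([0.8, 0.2, 0.1])))
```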
The core idea is to let the model use external data instead of relying only on built-in knowledge, that is, Retrieval-Augmented Generation (RAG).
- Data preparation: chunking and adding metadata (tags, sources, categories)
- Vector retrieval: embed text into vectors for storage and similarity search
- Answer generation from retrieved data: add the retrieved fragments to the prompt, or constrain the model to answer only from the provided data
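Putting those three steps together, a minimal RAG sketch could look like the following, where `embed()` stands in for a real embedding model and the chunks are invented:

```python
# Minimal RAG sketch: retrieve top chunks, then build a grounded prompt.
import numpy as np

chunks = [
    {"text": "Returns are accepted within 30 days.", "source": "policy.md"},
    {"text": "Shipping takes 3-5 business days.", "source": "faq.md"},
]

def embed(text: str) -> np.ndarray:
    # Placeholder: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(8)

def retrieve(query: str, k: int = 1) -> list[dict]:
    q = embed(query)
    sims = [(float(q @ embed(c["text"])), c) for c in chunks]
    return [c for _, c in sorted(sims, key=lambda s: s[0], reverse=True)[:k]]

def build_rag_prompt(query: str) -> str:
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in retrieve(query))
    return (
        "Answer ONLY from the context below. If the answer is not there, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

print(build_rag_prompt("What is the return window?"))
```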
The next step is letting AI not just answer questions but plan, decide, and act autonomously, becoming an AI agent.
Common design patterns:
- ReAct (think while acting)
- Task Decomposition (break big problems into small steps)
- Reflexion (agent self-reflection and correction)
- Planner-Executor/Critic-Actor (planner vs evaluator division of labor)
- Hierarchical/Collaborative (agents arranged in a hierarchy or as cooperating peers)
Learn the agent's short-term memory (the current conversation or task history) and long-term memory (cross-session information such as user preferences). To keep decisions safe and reliable, add manual confirmation at key steps.
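Here is a toy loop in the Planner-Executor spirit with a human confirmation gate on risky steps. The planner, tools, and action names are all hypothetical placeholders:

```python
# Toy Planner-Executor loop with manual confirmation on risky actions.
RISKY_ACTIONS = {"delete_record", "send_email"}

def plan(task: str) -> list[str]:
    # Placeholder: a real planner would ask an LLM to decompose the task.
    return ["search_docs", "send_email"]

def execute(action: str) -> str:
    return f"executed {action}"  # placeholder tool dispatch

def run_agent(task: str) -> list[str]:
    history = []  # short-term memory: this task's step history
    for action in plan(task):
        if action in RISKY_ACTIONS:
            if input(f"Approve '{action}'? [y/N] ").strip().lower() != "y":
                history.append(f"skipped {action} (human veto)")
                continue
        history.append(execute(action))
    return history

print(run_agent("notify the customer about their refund"))
```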
Integrate the preceding capabilities into an AI system that can be operated and maintained. You need to understand Docker and Kubernetes.
- Selecting compute: GPU/CPU offerings from cloud providers (AWS, GCP, Azure), chosen according to model size and request volume
- Continuous integration and continuous deployment (CI/CD): automated testing and deployment
- Model routing strategy: route each task to the most suitable model, and switch to a backup on failure (tools such as liteLLM, Orq, or Martian can help; a hand-rolled sketch of the pattern follows)
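Libraries like liteLLM package this up, but the underlying pattern is simple enough to sketch by hand; the two model clients below are fake placeholders:

```python
# Hand-rolled model routing with fallback; clients are placeholders.
from collections.abc import Callable

def cheap_model(prompt: str) -> str:
    return f"cheap: {prompt}"  # placeholder client

def strong_model(prompt: str) -> str:
    raise RuntimeError("provider outage")  # simulate a failing backend

def route(prompt: str, needs_reasoning: bool) -> str:
    # Prefer the strong model for hard tasks; fall back on failure.
    candidates: list[Callable[[str], str]] = (
        [strong_model, cheap_model] if needs_reasoning
        else [cheap_model, strong_model]
    )
    for model in candidates:
        try:
            return model(prompt)
        except Exception:
            continue  # try the next backend
    raise RuntimeError("all model backends failed")

print(route("Summarize this contract.", needs_reasoning=True))
```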
Make the system transparent: you should be able to understand its behavior and measure its performance, because you can't optimize what you don't evaluate. Learn the basics of LLM observability and choose evaluation timing and methods that fit your cost constraints. Many off-the-shelf platforms are available.
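Even before adopting a platform, the core of evaluation is just a loop over a golden set. The system under test, the cases, and the exact-match metric below are illustrative:

```python
# Minimal offline evaluation loop over a tiny, invented golden set.
def system_under_test(question: str) -> str:
    return {"capital of France?": "Paris"}.get(question, "unknown")  # placeholder

eval_cases = [
    {"question": "capital of France?", "expected": "Paris"},
    {"question": "capital of Japan?", "expected": "Tokyo"},
]

def run_eval() -> float:
    hits = 0
    for case in eval_cases:
        answer = system_under_test(case["question"])
        correct = answer.strip().lower() == case["expected"].lower()
        hits += correct
        print(f"{case['question']!r}: {answer!r} ({'PASS' if correct else 'FAIL'})")
    return hits / len(eval_cases)

print(f"accuracy = {run_eval():.0%}")
```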
Consider whether the system could be maliciously exploited or cause real harm. Learn to guard LLM inputs and outputs, test model-driven applications, and practice jailbreaking or bypassing the protections yourself.
Finally, watch the trends in the AI field:
- Voice, vision, and robotics: agentic AI that integrates multiple capabilities and interacts with the real world. Griciūnas is optimistic about on-device agents, aggressive model quantization, and foundation models optimized for robots
- Automated prompt engineering: prompts will no longer be fixed by hand; systems will generate and tune them automatically, and you will only need to prepare test datasets to evaluate the results
For test engineers who want to move toward AI, a complementary set of principles applies:
1. Anchor your positioning and don't drift: center on "AI + testing". Focus on using AI to improve testing efficiency (automation, defect prediction) or on testing AI systems themselves (data, model, and robustness testing). There is no need to dive blindly into AI algorithm R&D; a test engineer's advantages in business understanding and problem solving still count.
2. Prioritize skills in order: first add Python (an essential tool) plus basic statistics and SQL (for data handling), then learn the core of AI testing (TensorFlow/PyTorch basics and AI testing frameworks such as EvidentlyAI), and finally learn to call large-model APIs (for example, using ChatGPT to help generate test cases). Don't try to learn everything at once.
3. Start from your existing work: first pilot AI in your own test scenarios (using AI to generate test cases, image-recognition-based UI testing, predicting abnormal interface data), then try to join the testing of AI projects (focusing on data quality, model bias, boundary scenarios, and adversarial-sample testing), learning through practice.
4. Grasp what makes AI testing special: compared with traditional testing, pay extra attention to data authenticity, model stability (output consistency across different inputs), and algorithm security (privacy leaks, adversarial attacks), and build a testing mindset suited to AI systems.
The path below advances step by step from basic tools to practical implementation, stays focused on "AI + testing" scenarios rather than deep algorithm work, and balances practicality with operability.
Core Goal: Master the tool skills and data-processing basics needed to get started with AI testing, laying the groundwork for hands-on work with AI testing tools and data-verification scenarios.
Essential Skills:
- Python core: be proficient with syntax, functions, and classes, focusing on common testing libraries such as pytest and requests, so you can write automated scripts.
- Basic statistics: understand mean, variance, hypothesis testing, and common data distributions, so you can read basic AI model evaluation metrics.
- SQL skills: master querying, filtering, and aggregation so you can pull data from a test-environment database and verify data consistency.
Tools/Framework:
- Programming tools: VSCode, Jupyter Notebook
- Data tools: MySQL, Pandas, NumPy
- Learning resources: LeetCode's easy Python problem set, SQL practice platforms
Practical Project (Close to Testing Work):
- Use Python to batch-process test logs, extracting frequently failing modules and defect-related data to help pinpoint testing priorities (see the sketch after this list).
- Use SQL to query the test-environment database, verifying the integrity and consistency of business data to simulate the data-verification scenarios of AI testing.
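A sketch of the log-processing exercise; the log format (an `ERROR module=<name>` pattern) is an assumed example, not a standard:

```python
# Batch-process test logs and count frequently failing modules.
import re
from collections import Counter
from pathlib import Path

ERROR_RE = re.compile(r"ERROR\s+module=(\w+)")  # assumed log format

def top_failing_modules(log_dir: str, n: int = 5) -> list[tuple[str, int]]:
    counts: Counter[str] = Counter()
    for log_file in Path(log_dir).glob("*.log"):
        for line in log_file.read_text(errors="ignore").splitlines():
            match = ERROR_RE.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts.most_common(n)

if __name__ == "__main__":
    for module, count in top_failing_modules("./test_logs"):
        print(f"{module}: {count} errors")
```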
Core Goal: Master the core logic and specialized tools of AI testing, independently run basic tests of AI systems, and understand the core differences between AI testing and traditional testing.
Essential Skills:
- Machine learning basics: understand classification and regression models and the full train/test workflow; in-depth algorithm derivations are not needed.
- Core dimensions of AI testing: master directions such as data quality (completeness, unbiasedness), model stability (output consistency across different inputs), and robustness (resistance to interference).
- Model evaluation metrics: understand accuracy, recall, F1-score, drift detection, and similar metrics, so you can judge model performance from the numbers (see the sketch after this list).
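The classification metrics are one-liners in scikit-learn; the labels below are toy data:

```python
# Computing the listed metrics with scikit-learn on toy predictions.
from sklearn.metrics import accuracy_score, f1_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]  # ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1]  # model predictions

print(f"accuracy = {accuracy_score(y_true, y_pred):.2f}")
print(f"recall   = {recall_score(y_true, y_pred):.2f}")
print(f"F1-score = {f1_score(y_true, y_pred):.2f}")
```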
Tools/Framework:
- Basic frameworks: TensorFlow, PyTorch (entry level: enough to load and call models)
- AI testing tools: EvidentlyAI (model drift detection), Great Expectations (data quality validation), TestGPT (intelligent test case generation)
- Auxiliary tool: Scikit-learn (simple model training, used to simulate AI test scenarios)
Practical Project (Close to Testing Work):
- Run data quality tests against your company's existing AI features (recommendation systems, intelligent customer service), verifying that the input data is complete and unbiased.
- Use EvidentlyAI to compare accuracy between old and new versions of a recommendation system and detect model performance drift (see the sketch after this list).
- Use TestGPT to generate interface test cases, compare them against manually written ones, and analyze the differences in efficiency and scenario coverage.
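A drift-check sketch with the evidently library; the API shown follows its 0.4.x documentation and may differ in newer releases, and the data frames are synthetic stand-ins for real reference/production data:

```python
# Data drift check sketch with evidently (0.4.x-style API; may differ now).
import pandas as pd
from evidently.metric_preset import DataDriftPreset
from evidently.report import Report

reference = pd.DataFrame({"score": [0.1, 0.2, 0.3, 0.4, 0.5]})  # old version
current = pd.DataFrame({"score": [0.6, 0.7, 0.8, 0.9, 1.0]})    # new version

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("drift_report.html")  # open in a browser to inspect drift
```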
Core Goal: Use large models to improve testing efficiency, handle testing of complex AI scenarios, and build a simple AI-testing automation pipeline.
Essential Skills:
- Large-model API calls: master API parameter configuration and context design for models such as ChatGPT and Tongyi Qianwen, so you can flexibly call large models to assist testing.
- Adversarial-sample basics: understand simple adversarial attack ideas (synonym replacement in text, slight image perturbations) and run basic adversarial tests.
- AI test automation scripts: integrate large models with testing tools and write end-to-end automated AI test scripts.
Tools/Framework:
- Large model tools: ChatGPT API, Tongyi Qianwen API
- Automation tools: Selenium+OpenCV (image recognition test), Postman (API call)
- Adversarial testing tools: TextAttack (text adversarial testing), Foolbox (image adversarial testing)
Practical Project (Close to Testing Work):
- Write a Python script that calls the ChatGPT API to batch-generate multi-scenario test cases (normal and abnormal), and integrate it into the existing test process (see the sketch after this list).
- Run adversarial tests on AI image-recognition features (CAPTCHA recognition, product image classification), verifying the model's fault tolerance with slightly modified images.
- Build a simple AI test automation pipeline: use AI to generate test cases → execute automated tests → use a model to analyze results and surface high-frequency defects.
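A sketch of the case-generation script using the official openai Python package (requires `OPENAI_API_KEY`; the model name and prompt wording are illustrative choices):

```python
# Batch-generate test cases via the OpenAI chat completions API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_test_cases(feature: str, n: int = 5) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: use whichever model you have access to
        messages=[
            {"role": "system", "content": "You are a senior test engineer."},
            {
                "role": "user",
                "content": f"Write {n} test cases (normal and abnormal scenarios) "
                           f"for this feature: {feature}. Output one per line.",
            },
        ],
    )
    return response.choices[0].message.content

print(generate_test_cases("login form with email and password"))
```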
Recommended resources:
1. Python + statistics: Coursera's "Python for Everybody"; Bilibili video walkthroughs of Li Hang's "Statistical Learning Methods"
2. AI testing: the EvidentlyAI official documentation (hands-on tutorials); the Great Expectations Chinese-language guide
3. Large-model APIs: the OpenAI official documentation (includes API call examples); the Tongyi Qianwen developer documentation
Learning tips:
1. You don't need to read complex deep-learning papers; focus on how to use the tools for testing, with solving real testing problems as the core.
2. Every tool or skill you learn should land in a concrete test project (even a simulated one), to avoid studying without ever applying.
3. Prioritize using AI on the repetitive parts of your work (writing repetitive test cases, log analysis) before expanding to complex scenarios; quick wins build a sense of progress.
(From: TesterHome)