DATA & AI CAREER ROADMAP · 2025–2026

Cloud
Literacy
Roadmap

A practical, honest guide to building a job-ready Data & AI career — with cloud as the foundation, not an afterthought.

beginner-friendly · cloud-first · portfolio-driven · job-market-aligned · 2025–2026
Mohammad Reza Sharifi (Mo) · Data Solutions Engineer · Serios Group · Newcastle upon Tyne

What being job-ready actually looks like

Before you start, be clear on where you're heading. Here's what employers actually look for in 2025–2026 — not the polished job description version, but the real picture.

Technical Capabilities
what you can DO
  • Write Python to clean, transform, and analyse data
  • Query databases with SQL confidently
  • Build and deploy data pipelines to the cloud
  • Read and navigate cloud platform documentation
  • Debug broken pipelines, not just write new ones
  • Understand how data moves end-to-end in a system
  • Use version control (Git) as a daily habit
Portfolio Evidence
what you can SHOW
  • 2–3 end-to-end projects on GitHub (not just notebooks)
  • At least one project deployed to a cloud platform
  • Clear READMEs explaining what, why, and how
  • Evidence of working with real or realistic datasets
  • Documentation that shows how you think, not just what you built
Cloud Fluency
what you UNDERSTAND
  • Know what compute, storage, and managed services mean
  • Comfortable using at least one major platform (Azure, AWS, or GCP)
  • Understand data flow: source → pipeline → storage → output
  • Can explain your architecture decisions simply
  • Not afraid of cloud consoles, CLIs, or error logs
Professional Skills
how you OPERATE
  • Can explain technical work to non-technical stakeholders
  • Comfortable reading documentation independently
  • Iterative mindset — ships something, improves it
  • Online presence that shows ongoing learning
  • Asks the right questions, not just executes tasks
Why cloud literacy is non-negotiable

Data and AI don't live on your laptop. They live on cloud platforms. Every serious employer runs their data infrastructure on Azure, AWS, or GCP. If you can only run a Jupyter notebook locally, you're not job-ready — you're experiment-ready. These are different things.

The roadmap — phase by phase

A step-by-step path from foundations to job-ready. Each phase builds on the one before it. Don't rush — the early stages matter far more than they feel like they do.

PHASE 01
Foundations
4–6 weeks

Get the mental model right before diving into technical skills. If you rush through this, everything else becomes shakier than it needs to be. Focus on how systems actually work — not just on writing code.

Python Basics
Variables, loops, functions, lists, dictionaries. Goal: write a script that reads a CSV file, cleans it, and outputs a summary. Tools: pandas, Jupyter.
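A minimal sketch of that goal script, assuming a hypothetical sales.csv with date, region, and amount columns (swap in your own file and column names):

  # clean_csv.py: read a CSV, clean it, output a summary
  import pandas as pd

  df = pd.read_csv("sales.csv", parse_dates=["date"])
  df = df.dropna(subset=["amount"])                      # drop rows missing the value we care about
  df["region"] = df["region"].str.strip().str.title()    # tidy inconsistent text

  summary = df.groupby("region")["amount"].agg(["count", "sum", "mean"])
  summary.to_csv("summary.csv")
  print(summary)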
How Systems Work
What is a server? What is an API? How does data move over the internet? Understanding this makes cloud concepts click immediately.
Git & GitHub
Version control is not optional — it's a professional baseline. Learn git init, git commit, and git push. Every project you ever build goes here.
Command Line
Terminal basics: navigating directories, running scripts, reading outputs. You'll use this constantly when working with cloud platforms.
PHASE 02
Data Skills
4–6 weeks

The core craft of working with data. Real datasets are messy, incomplete, and confusing. Learn to work with data as it is, not as you wish it were.

SQL
SELECT, WHERE, JOIN, GROUP BY, aggregations. SQL is the most underrated skill in data. Nearly every data role uses it daily. Practise on real data with SQLite or PostgreSQL.
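If you want to practise without installing a database, Python's built-in sqlite3 module is enough. A rough sketch with a made-up orders table:

  # sql_practice.py: SELECT / WHERE / GROUP BY against SQLite (ships with Python)
  import sqlite3

  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
  conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 60.0)])

  rows = conn.execute("""
      SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue
      FROM orders
      WHERE amount > 50
      GROUP BY region
      ORDER BY revenue DESC
  """).fetchall()
  print(rows)   # [('North', 2, 180.0), ('South', 1, 80.0)]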
Data Analysis
Load, clean, explore, and summarise datasets. Build the habit of asking "what does this actually mean?" before charting anything. Tools: pandas, NumPy.
Visualisation
Turn numbers into insight. Charts that communicate beat charts that impress. Tools: matplotlib, seaborn, Plotly.
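One way to keep a chart honest: a small matplotlib sketch that answers a single question, using the hypothetical regional figures from earlier.

  # chart.py: one question, one chart, labelled axes
  import pandas as pd
  import matplotlib.pyplot as plt

  summary = pd.DataFrame({"region": ["North", "South", "East"], "revenue": [180, 80, 140]})
  ax = summary.plot.bar(x="region", y="revenue", legend=False)
  ax.set_ylabel("Revenue (£)")
  ax.set_title("Which region drives revenue?")
  plt.tight_layout()
  plt.savefig("revenue_by_region.png")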
Statistics Basics
Mean, median, distributions, correlation, outliers. You don't need a statistics degree — you need enough to stop misreading your own data.
First Real Project
Take a public dataset (Kaggle, data.gov, ONS). Clean it. Analyse it. Write a README explaining what you found and why it matters. Push to GitHub.
PHASE 03
Cloud Fundamentals
★ CORE FOCUS · 4–6 weeks

This is the phase most people skip — and it's exactly why they struggle to get hired. Cloud isn't something you bolt on at the end. It's the environment where data and AI actually run. Get comfortable here before building anything that matters.

Pick ONE platform and commit. Azure, AWS, and GCP all have free tiers. For data/AI roles in the UK, Azure is particularly valuable. For global roles, AWS has the broadest demand. GCP has strong ML tooling. Don't try to learn all three at once — depth beats breadth at this stage.
What Cloud Is
Cloud = someone else's computers, available over the internet, billed by use. The key shift: instead of running things on your laptop, you run them on infrastructure that can scale, persist, and be shared. That's it.
Compute
Virtual machines and managed runtimes. Azure Functions, AWS Lambda, and Cloud Run let you run code without managing servers. Build: deploy a Python function to the cloud.
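As a sketch of what "deploy a Python function" can look like on Azure, here is a minimal HTTP-triggered function using the Python v2 programming model (deployment steps, requirements.txt, and host.json not shown):

  # function_app.py: minimal HTTP-triggered Azure Function (Python v2 model)
  import azure.functions as func

  app = func.FunctionApp(http_auth_level=func.AuthLevel.ANONYMOUS)

  @app.route(route="hello")
  def hello(req: func.HttpRequest) -> func.HttpResponse:
      name = req.params.get("name", "world")
      return func.HttpResponse(f"Hello, {name}. Served from the cloud, not a laptop.")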
Storage
Blob/object storage for files and data. Azure Blob Storage, AWS S3, and GCS are where real data pipelines stage their data. Build: upload and retrieve files programmatically.
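A rough sketch of that build task with azure-storage-blob (pip install azure-storage-blob); the connection string and container name are placeholders:

  # blob_roundtrip.py: upload a file to blob storage, then read it back
  import os
  from azure.storage.blob import BlobServiceClient

  service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
  container = service.get_container_client("raw-data")

  with open("summary.csv", "rb") as f:
      container.upload_blob(name="summary.csv", data=f, overwrite=True)

  downloaded = container.download_blob("summary.csv").readall()
  print(downloaded[:200])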
Databases
Managed cloud databases vs. local ones. Understand the difference between SQL databases, NoSQL, and data warehouses. Examples: Azure SQL, BigQuery, RDS.
Identity & Access
Who can access what? IAM roles, service principals, API keys. This trips up beginners constantly. Understand it early — it affects every cloud project you build.
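One pattern worth learning early: let the platform supply identity instead of pasting keys into code. A sketch using azure-identity (the account URL is a placeholder); locally it picks up your az login, in Azure it picks up the service's managed identity:

  # list_containers.py: authenticate without hard-coded keys
  from azure.identity import DefaultAzureCredential
  from azure.storage.blob import BlobServiceClient

  credential = DefaultAzureCredential()
  service = BlobServiceClient(
      account_url="https://<your-storage-account>.blob.core.windows.net",
      credential=credential,
  )
  for container in service.list_containers():
      print(container.name)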
Reading Docs & Errors
The cloud is self-service. Navigating official documentation and reading error messages calmly is a genuine, rare skill. Practice: deliberately break something, then fix it using only the docs.
Recommended starting paths

Microsoft Azure: AZ-900 (free on Microsoft Learn) → Azure Data Fundamentals (DP-900) → build a data pipeline in Azure Data Factory

AWS: AWS Cloud Practitioner (free on Skill Builder) → Data Analytics Fundamentals → build a pipeline using S3 + Glue + Athena

GCP: Cloud Digital Leader → BigQuery basics → build an end-to-end pipeline with Cloud Storage + Dataflow

PHASE 04
Practical Implementation
6–8 weeks

Stop doing isolated tasks. Start building end-to-end systems. The goal here is one complete pipeline that works in the real world — data in, insight out, deployed in the cloud.

End-to-End Pipeline
Ingest data from a source → store it in cloud storage → transform it → load it into a data warehouse → visualise it. Build this once and you understand how real data systems work. Tools: Azure Data Factory, Airflow, dbt.
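The shape of that pipeline, sketched as plain functions. The source URL, column names, and output are hypothetical; in practice an orchestrator (Data Factory, Airflow) schedules and retries these steps:

  # pipeline.py: ingest → transform → load, as three small functions
  import pandas as pd

  def ingest() -> pd.DataFrame:
      return pd.read_json("https://example.com/api/readings")   # placeholder source

  def transform(raw: pd.DataFrame) -> pd.DataFrame:
      clean = raw.dropna(subset=["value"]).copy()
      clean["reading_date"] = pd.to_datetime(clean["timestamp"]).dt.date
      return clean.groupby("reading_date", as_index=False)["value"].mean()

  def load(df: pd.DataFrame) -> None:
      df.to_parquet("daily_readings.parquet")   # stand-in for a warehouse load (e.g. to_sql)

  if __name__ == "__main__":
      load(transform(ingest()))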
APIs & Data Sources
Pull data from a real API (weather, finance, sport, public services). Understand HTTP, authentication, rate limits. Tools: requests, httpx.
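A small requests sketch covering the things that actually bite: timeouts, status checks, and a simple backoff on rate limits. Open-Meteo is a real, key-free weather API at the time of writing; swap in whatever source interests you:

  # fetch_api.py: pull hourly temperatures for Newcastle from a public API
  import time
  import requests

  URL = "https://api.open-meteo.com/v1/forecast"
  params = {"latitude": 54.97, "longitude": -1.61, "hourly": "temperature_2m"}

  for attempt in range(3):
      resp = requests.get(URL, params=params, timeout=10)
      if resp.status_code == 429:          # rate limited: back off and retry
          time.sleep(2 ** attempt)
          continue
      resp.raise_for_status()
      data = resp.json()
      break
  else:
      raise RuntimeError("rate limited on every attempt")

  print(len(data["hourly"]["temperature_2m"]), "hourly readings")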
Containers
Package your code so it runs the same everywhere. Docker is the lingua franca of deployment. Learn to build an image, run a container, push to a registry.
Monitoring & Logs
Real pipelines break. Learn to read logs, set up alerts, and understand what "healthy" looks like before things go wrong. Tools: Azure Monitor, CloudWatch.
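Even before cloud monitoring, get your own logs right. A minimal Python logging setup (in Azure these lines would surface in Application Insights / Azure Monitor; the pipeline steps are placeholders):

  # run_with_logging.py: make failures diagnosable after the fact
  import logging

  logging.basicConfig(
      level=logging.INFO,
      format="%(asctime)s %(levelname)s %(name)s %(message)s",
  )
  log = logging.getLogger("pipeline")

  try:
      log.info("ingest started")
      rows_loaded = 0                       # placeholder for real work
      log.info("load complete: %d rows", rows_loaded)
  except Exception:
      log.exception("pipeline failed")      # full traceback goes to the logs
      raise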
Testing
Your pipeline is not done until it has tests. At minimum: test your transformations, test your data quality. Tools: pytest, great_expectations.
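A sketch of what "test your transformations" means in practice, assuming the hypothetical transform() from the pipeline sketch above. Run with pytest:

  # test_transform.py
  import pandas as pd
  from pipeline import transform

  def test_transform_drops_missing_values():
      raw = pd.DataFrame({"timestamp": ["2025-01-01T00:00", "2025-01-01T01:00"],
                          "value": [1.0, None]})
      assert transform(raw)["value"].notna().all()

  def test_transform_output_has_expected_columns():
      raw = pd.DataFrame({"timestamp": ["2025-01-01T00:00"], "value": [2.0]})
      assert list(transform(raw).columns) == ["reading_date", "value"]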
PHASE 05
AI & Advanced
ongoing

Once you have strong data foundations and cloud fluency, AI tools become genuinely accessible. Without those foundations, AI is just API calls you don't understand.

ML Fundamentals
Supervised vs. unsupervised learning. Classification, regression, clustering. Train a model, evaluate it, understand what the metrics mean. Tool: scikit-learn.
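The baseline loop looks like this: a scikit-learn sketch using a bundled toy dataset purely so it runs anywhere; swap in data you actually care about.

  # train_eval.py: split, train, evaluate, read the metrics
  from sklearn.datasets import load_breast_cancer
  from sklearn.model_selection import train_test_split
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import classification_report

  X, y = load_breast_cancer(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

  model = LogisticRegression(max_iter=5000)
  model.fit(X_train, y_train)

  print(classification_report(y_test, model.predict(X_test)))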
Cloud ML Services
Managed ML platforms let you train and deploy models without infrastructure complexity (Azure ML, SageMaker, Vertex AI). Build: train and deploy a model via the cloud, not your laptop.
LLMs & APIs
Use existing models via API rather than training from scratch. Understand prompting, context, and limitations. Build something useful with them. Tools: OpenAI API, Azure OpenAI.
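A minimal sketch with the OpenAI Python SDK (pip install openai). The model name and prompt are illustrative, and the same pattern applies to Azure OpenAI with a different client setup:

  # llm_call.py: use a hosted model via API instead of training from scratch
  import os
  from openai import OpenAI

  client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
  response = client.chat.completions.create(
      model="gpt-4o-mini",                  # use whatever model your account offers
      messages=[
          {"role": "system", "content": "You are a concise data documentation assistant."},
          {"role": "user", "content": "Summarise what a data pipeline does in two sentences."},
      ],
  )
  print(response.choices[0].message.content)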
Microsoft Fabric / Databricks
Modern enterprise data platforms that unify data engineering, analytics, and AI. Increasingly expected in job specs. Platforms: Microsoft Fabric, Databricks.

Skills that actually matter

Three layers — all essential. Technical skills get you the interview. Process skills keep you employed. Professional skills shape where you go from there.

Technical
gets you in the door
  • Cloud platforms (Azure/AWS/GCP)
  • Python & data libraries
  • SQL
  • Data pipelines & ETL
  • Git & CI/CD
  • Containers & Docker
  • APIs & integration
Process
keeps you employed
  • Systems thinking
  • Debugging & root cause
  • Reading documentation
  • Data quality & testing
  • Monitoring & observability
  • Iterative delivery
  • Writing clear specs & docs
Professional
drives your growth
  • Explaining tech to non-tech
  • Adaptability & curiosity
  • Resilience (things break)
  • Asking the right questions
  • Continuous learning mindset
  • Portfolio visibility
  • Cross-functional collaboration

Learning resources that are worth your time

This isn't everything that exists — it's what's genuinely worth your time. Each one comes with a note on how to actually use it, not just that it exists.

STRUCTURED LEARNING
Microsoft Learn: free, official, and surprisingly good. Start with AZ-900 (Azure Fundamentals) — it gives you the mental model for cloud without any cost. Then DP-900 for data. Complete the hands-on sandboxes — don't just read. The sandbox environments let you use a real Azure console for free. If a lab feels confusing, that discomfort is the learning.
Free · Azure · Sandbox labs · Certifications
HANDS-ON LABS
AWS Skill Builder: AWS's free learning platform. Start with Cloud Practitioner Essentials, then the Data Analytics Fundamentals course. The labs give you real AWS consoles without a credit card. Rule: every video must be followed by doing the thing in the console yourself — not just watching.
Free tier · AWS · Interactive labs
DATA PRACTICE
Use Kaggle's free micro-courses — Python and SQL are genuinely useful (3–5 hours each). Then explore real datasets and ask a specific question of each one. Don't just run notebooks — write a README explaining what you found and what it means. Treat every dataset like a real business problem.
Free · Datasets · Notebooks · Competitions
STRUCTURED TRACKS
DataCamp: best for building Python and SQL fluency through structured tracks. The Data Engineer and Data Analyst with Python tracks are well-paced. Use with a subscription (often discounted or free via student programmes). The golden rule: type every example yourself, then close the browser and rebuild it from scratch. Passive watching = forgetting within 48 hours.
Subscription · Python · SQL · Pipelines · Certificates
BROAD COURSES
Coursera: strong for industry-backed certifications. The IBM Data Engineering Professional Certificate and the Google Data Analytics Certificate are well-regarded. Audit for free if you don't need the certificate. Apply for financial aid if cost is a barrier — it genuinely works. Rule: only start a course with a specific skill gap in mind.
Audit free · Industry certs · Financial aid available
BROAD COURSES
Udemy: never pay full price — sales happen constantly and courses are typically £10–15. Best choices: highly-rated Azure courses by Scott Duffy or Alan Rodrigues, and data engineering courses by Frank Kane. Check reviews carefully — quality varies. Look for courses updated within the last 12 months. Avoid courses that are mostly slides with no hands-on labs.
Paid (discounted) · Azure / AWS · Practical focus
PORTFOLIO
GitHub: not just code storage — your professional portfolio. Every project needs a README that explains the problem, your approach, and what you learned. Pin your best 4–6 repos (including small projects and coursework). Commit regularly — your contribution graph tells a story about consistency. A green streak matters less than a few well-documented repos.
Free · Portfolio · Version control · GitHub Pages
CLOUD PRACTICE
Free Cloud Tiers
Azure Free Account · AWS Free Tier · GCP Free Tier.

Use these to build real things, not just follow tutorials. Set a budget alert immediately (£5 limit) so you don't accidentally accrue costs. Mistakes happen — an alert means they're cheap mistakes.
Free tier · Real infrastructure · Set budget alerts
READING
"Fundamentals of Data Engineering"
Joe Reis & Matt Housley (O'Reilly, 2022). The clearest explanation of how modern data systems are designed and why. Don't read it cover to cover upfront — dip into the chapters that match what you're building. The vocabulary it gives you is worth the read alone. Available on O'Reilly with a free trial.
Book · O'Reilly · Architecture · Essential

Building your portfolio and getting visible

Your work only counts if people can find it. Employers notice visibility, consistency, and curiosity — not just polished finished projects. Everything you build and learn is worth putting out there.

💡
Don't wait for a "proper" project. A Power BI dashboard from a course, a SQL query you wrote for coursework, a small ML model you trained following a tutorial — all of it counts. The habit of sharing early is more valuable than waiting until something feels perfect. It never will. Share it anyway.

Sharing your work — the full spectrum

From your very first script to a production pipeline — here's how to think about visibility at each stage.

📊
Course Assignments & Dashboards
Power BI dashboards, Tableau exercises, SQL coursework, Excel analyses. These are real skills and real work.
Share a screenshot on LinkedIn. Describe what the data showed and what you found interesting. You don't need to frame it as a "project" — frame it as learning in action.
🧪
Small ML Models & Experiments
A sentiment classifier on movie reviews. A simple regression on housing prices. A clustering model on any dataset you care about.
Push to GitHub. Write 3 sentences about what you tried, what worked, and what you'd do differently. That honesty is more impressive than polished silence.
🔧
Scripts & Automation
A Python script that pulls data from an API. A SQL query that answers a business question. A notebook that cleans a messy dataset.
Add a README. Explain what problem you were solving. Even a 50-line script on GitHub shows you can write real code with purpose.
☁️
Cloud Experiments
Setting up a storage account. Connecting a database. Running a cloud function. Following a Microsoft Learn lab and actually doing it.
Screenshot your Azure/AWS console with something running. Post it with a short description. "Today I deployed my first function to Azure. Here's what I learned about IAM the hard way."
📜
Certificates & Completions
AZ-900, DP-900, DataCamp track completions, Coursera specialisations, Kaggle micro-courses. These signal structured effort and commitment.
Don't just post the badge. Write one paragraph: what you learned, one thing that surprised you, and how you're going to use it. That's the content. The badge is just the image.
🏗️
End-to-End Projects
A real pipeline. A deployed model. A data product someone can actually use. This is the gold standard — but it doesn't need to be first.
This deserves a full case study: what was the problem, what decisions you made, what broke, what you learned, and a link to the running thing. Architecture diagrams welcome.

Certificates — doing them right

Certificates aren't the goal — but they're evidence of structured effort. The difference between a certificate that helps you and one that doesn't is what you did with the learning.

AZ-900 Azure Fundamentals
DP-900 Data Fundamentals
AWS Cloud Practitioner
AWS Data Analytics Fundamentals
DataCamp Data Engineer Track
DataCamp Python Track
Google Data Analytics (Coursera)
IBM Data Engineering (Coursera)
Kaggle Python Micro-Course
Kaggle SQL Micro-Course
The rule: for every certificate you earn, build or apply one concrete thing from it — then share both. "I completed DP-900 and built a small data pipeline using Azure Data Factory" is worth ten times more than a badge with no context. The combination signals real learning, not just course completion.

Projects worth building

These four project types cover the full range of skills employers look for. You don't need all four — two solid ones beat four half-finished ones.

1
End-to-End Data Pipeline
Ingest real data from a public API, store it in cloud blob storage, transform it with Python, load it into a cloud database, visualise with a dashboard. Document the architecture with a diagram.
Python · Azure Blob / S3 · SQL · Data Factory / Airflow · Power BI / Grafana
2
Cloud-Deployed ML Model
Train a model on a meaningful dataset (not MNIST). Deploy it as a REST API in the cloud. Make it callable from the outside world. Write up the model's limitations honestly.
scikit-learn / PyTorch · FastAPI · Docker · Azure Container Apps / Lambda
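A sketch of the serving layer for this project: FastAPI wrapping a model trained elsewhere. The feature names and the model.joblib file are placeholders for your own work; run locally with uvicorn serve:app before containerising.

  # serve.py: expose a trained model as a REST endpoint
  import joblib
  from fastapi import FastAPI
  from pydantic import BaseModel

  app = FastAPI(title="price-model")
  model = joblib.load("model.joblib")        # trained and saved in a separate script

  class HouseFeatures(BaseModel):
      bedrooms: int
      floor_area_m2: float
      distance_to_centre_km: float

  @app.post("/predict")
  def predict(features: HouseFeatures) -> dict:
      X = [[features.bedrooms, features.floor_area_m2, features.distance_to_centre_km]]
      return {"predicted_price": float(model.predict(X)[0])}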
3
Data Quality Monitor
Build a scheduled pipeline that checks a dataset for quality issues (missing values, schema drift, outliers) and sends an alert if something breaks. This demonstrates production-level thinking.
Python · great_expectations / pytest · Cloud Functions · Email / Slack alert
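The core of this project is small. A sketch of the check-and-alert step, with the dataset path, thresholds, and webhook URL as placeholders; scheduling (a timer-triggered cloud function or cron) and the real alert wiring sit around it:

  # quality_check.py: run checks, alert only when something is wrong
  import pandas as pd
  import requests

  EXPECTED_COLUMNS = {"date", "region", "amount"}

  def run_checks(path: str) -> list[str]:
      df = pd.read_csv(path)
      if set(df.columns) != EXPECTED_COLUMNS:
          return [f"schema drift: got {sorted(df.columns)}"]   # later checks assume the schema
      problems = []
      if df["amount"].isna().mean() > 0.05:
          problems.append("more than 5% of 'amount' values are missing")
      if (df["amount"] > df["amount"].quantile(0.99) * 10).any():
          problems.append("extreme outliers in 'amount'")       # crude rule; tune per dataset
      return problems

  if __name__ == "__main__":
      issues = run_checks("daily_extract.csv")
      if issues:
          requests.post("https://hooks.slack.com/services/PLACEHOLDER",
                        json={"text": "\n".join(issues)}, timeout=10)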
4
Domain-Specific Analysis
Pick a domain you care about (healthcare, sport, climate, education). Find real data. Ask a real question. Answer it. Write it up as a proper case study. This shows taste and judgment, not just technical skill.
Python SQL Visualisation GitHub Written case study

How to present your work

Building the project is half the work. How you document and share it is the other half. Employers read READMEs. Recruiters look at LinkedIn. Visibility is career infrastructure.

GitHub — your portfolio hub
  • Pin your 4–6 best repos (including course work and small experiments)
  • Every repo needs: what it does, why you built it, how to run it, what you learned
  • Include an architecture diagram — even a hand-drawn photo is fine
  • Show your commit history — it's a log of your growth and consistency
  • Small scripts belong here too. A clean, documented 80-line script shows professionalism.
LinkedIn — keeping people updated
  • Post when you finish a project, course, or even a tricky problem
  • Share screenshots with context — "here's what this dashboard shows and why I built it"
  • Certificate posts: add one specific thing you learned and one thing you'll apply
  • Engage with data practitioners genuinely — comments build visibility faster than posts
  • Your headline: describe what you DO and what you're building toward, not just your job title
Writing — it sharpens your thinking
  • Write about what you struggled with, not just what worked
  • Medium is a good starting point — low friction, real audience, no setup
  • "I spent 3 hours debugging this Azure connection issue — here's what it taught me" is better content than a tutorial
  • Reference: "Why Cloud Literacy Is No Longer Optional in Data/AI Careers" — start from your own experience, as that article does

Getting out there — events, community, and engagement

Learning in public and showing up in the community signals curiosity and commitment. These soft signals matter more than you'd think — especially early in your career.

IN PERSON
Meetups & Events
  • Attend local tech meetups — BBC North East, PyData, DataIRL
  • Go to hackathons even if you don't finish anything
  • University seminars and industry talks count
  • After attending: post a takeaway on LinkedIn
READING & LISTENING
Stay Informed
  • Read technical blogs: Towards Data Science, The Batch, Azure blog
  • Podcasts: Data Engineering Podcast, DataFramed, Lex Fridman (ML-focused)
  • Follow practitioners on LinkedIn who share real-world problems
  • After reading: note one thing you'll explore further
REFLECTION
Turn Inputs Into Outputs
  • After a podcast: write 3 sentences on what resonated
  • After a blog post: try one thing it described
  • After a meetup: connect with one person on LinkedIn with a real message
  • Share your reflection — even a short post counts
ONLINE COMMUNITY
Contribute & Learn
  • Answer questions on Stack Overflow (even simple ones)
  • Join Discord communities: dbt, Prefect, Azure community
  • Contribute small fixes or documentation to open source repos
  • Comment on posts thoughtfully — this is how people find you

Mistakes that slow people down

These are the patterns that come up again and again. Spotting them early saves a lot of wasted time.

Certificate Hoarding
Doing five courses to feel ready before building anything. Fix: build something concrete after every course you finish. One deployed project is worth more than three certificates.
Avoiding the Cloud
Staying in local Jupyter notebooks and never venturing outside them. Fix: deploy something — anything — to a cloud platform this week. The discomfort fades quickly. The gap on your CV doesn't.
Toy Projects Only
Building on the Titanic dataset or Iris flowers for the 8th time. Fix: find a domain you genuinely care about and work with real data from it.
Not Documenting Work
Uploading code with no README or explanation. Fix: if you can't explain what your project does in 3 sentences, you don't understand it well enough yet.
Breadth Before Depth
Skimming every cloud platform, every framework, every tool. Fix: go deep on one platform, one pipeline tool, one database. Breadth comes naturally once you have depth.
Building in Isolation
Never sharing your work, never getting feedback. Fix: post to LinkedIn, contribute to open source, attend local meetups (like this one!). Visibility compounds.
Waiting Until You're "Ready"
There's no moment where you suddenly feel ready. There's only building. Fix: ship something imperfect, then improve it. A project that actually exists — however rough — beats a perfect one that never ships.
Ignoring the Soft Layer
Assuming technical skill alone is enough. Fix: every technical decision needs a business explanation. Practice explaining your work to someone non-technical.

A realistic timeline — pick what fits your life

Speed matters less than consistency. Someone who puts in an hour a day for six months will do better than someone who goes hard for two weeks and burns out. Pick the version that fits your life and stick with it.

The six-month track

Month 1
Foundations + Python
Python basics, Git, command line. Build a script that reads and transforms a CSV. Push it to GitHub with a proper README. Output: 1 GitHub repo, functional Python foundation.
Month 2
Data Skills + First Project
SQL, pandas, data cleaning, basic visualisation. Use a real dataset from Kaggle or a public API. Output: end-to-end analysis published on GitHub.
Month 3
Cloud Fundamentals
AZ-900 or AWS CCP. Deploy your Phase 2 project to the cloud. Store data in blob storage. Connect to a cloud database. Output: project running in the cloud, not on your laptop.
Month 4–5
End-to-End Pipeline
Build a complete data pipeline: API → cloud storage → transformation → cloud DB → dashboard. Add monitoring. Add tests. Output: portfolio project 2, cloud-deployed, documented.
Month 6
Portfolio Polish + Job Search
Write up both projects as case studies. Update LinkedIn. Start applying. Aim for roles with "data engineer", "data analyst", or "cloud data" in the title. Output: 2 polished projects, active applications.
The twelve-month track

Months 1–2
Foundations (Deeper)
Python, Git, command line, basic web concepts. Build 2–3 small Python scripts. Work through Kaggle's Python and SQL courses fully. Output: solid fundamentals, no shortcuts.
Months 3–4
Data Skills + Analysis Projects
SQL (intermediate level), pandas, statistics, visualisation. Build 2 data analysis projects on domains you care about. Output: 2 GitHub repos with real analysis and written findings.
Months 5–6
Cloud Fundamentals (Full)
Complete AZ-900 or AWS CCP. Work through the data fundamentals certification path. Build hands-on labs. Deploy your existing projects to the cloud. Output: cloud certification + deployed projects.
Months 7–8
Pipeline Engineering
End-to-end pipeline with scheduling, monitoring, and tests. Learn Docker. Work through a data engineering course. Output: production-quality pipeline project on GitHub.
Months 9–10
AI/ML Layer
ML fundamentals, cloud ML services, deploy a model. Explore Fabric or Databricks. Build your ML portfolio project. Output: deployed ML project, cloud ML service experience.
Months 11–12
Portfolio + Brand + Apply
Polish all projects. Write case studies. Build LinkedIn presence. Start writing on Medium. Active job search with strong portfolio backing. Output: job-ready portfolio, active applications, visible online presence.
Consistency beats intensity. Forty-five focused minutes every day is worth more than a six-hour weekend session followed by nothing for two weeks. The people who get there aren't the fastest — they're the most consistent.

Where you'll be at the end of this

By the end of this, here's what you'll actually be able to do — not just describe in an interview.

🔨
Build
  • End-to-end data pipelines in the cloud
  • Deployed APIs and ML model endpoints
  • Monitored, tested data workflows
  • Documented, reproducible projects
  • Dashboards that answer real questions
💬
Explain
  • How data flows through a cloud system
  • The trade-offs in your architecture choices
  • What your model does and what it doesn't do
  • Why cloud matters for data and AI work
  • Your own journey, honestly and clearly
🚀
Deploy
  • Code that runs in cloud environments
  • Pipelines that run on a schedule
  • Models served via API
  • Containers that work across environments
  • Projects visible to anyone with a browser
FINAL MESSAGE

Cloud is the environment
where data and AI live.
Understanding it is not optional —
it is foundational.

You don't need a perfect background or another qualification. You need to build real things, put them somewhere real, and show your work. The gap between where you are and where you're headed is absolutely bridgeable — one project at a time.

Created by Mohammad Reza Sharifi (Mo) · Data Solutions Engineer · Serios Group · Newcastle
github.com/Mohrezasharifi  ·  medium.com/@rezatayeb2016  ·  @rezapy2020