I'm self-taught too; my CS degree happened before anybody heard of Python. Python is far too broad to ever have a deep understanding across the board, of each and every use case. You could take 4 years of classes in Python and find that only 10% is useful for what you are working on. If you want general Python info useful to DS, I think that these are some good general skills to focus on:

  • Libraries, Packages, and Modules

  • Data structures

  • Vectorization

  • Debugger

  • Working in venv

  • API calls

  • SQL calls

These will give you a foundation to set up an environment, import a bunch of data, efficiently process the data, and produce some useful output. Optimize from there.

Answer from Able_Excuse_4456 on reddit.com
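To make a few of the skills in that answer concrete (SQL calls, vectorization, data structures), here is a minimal sketch. The table, column names, and values are invented for illustration, and an in-memory SQLite database stands in for whatever database you actually use.

```python
import sqlite3

import pandas as pd

# Hypothetical data: an in-memory SQLite table stands in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, quantity INTEGER, unit_price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("ana", 2, 10.0), ("ben", 1, 25.0), ("ana", 5, 3.0)],
)
conn.commit()

# SQL call: pull the query result straight into a DataFrame.
orders = pd.read_sql_query("SELECT customer, quantity, unit_price FROM orders", conn)
conn.close()

# Vectorization: operate on whole columns instead of looping over rows.
orders["revenue"] = orders["quantity"] * orders["unit_price"]
orders["is_large"] = orders["revenue"] >= 20

# Data structures: collapse the result into a plain dict of totals per customer.
revenue_by_customer = orders.groupby("customer")["revenue"].sum().to_dict()
print(revenue_by_customer)
```

Running this inside a virtual environment (for example one created with `python -m venv .venv`) keeps pandas and its dependencies isolated from other projects.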
🌐
Reddit
reddit.com › r/learnpython › python for data analysis
r/learnpython on Reddit: Python for Data Analysis
April 4, 2023 -

Hey guys! So I work as a Data Analyst with SQL, Tableau and Excel but I would like to take the next step which is programming with python for Data Analysis.

Can anyone recommend a course or bootcamp for this? I have 0 programming experience btw. I don't want to learn things out of my scope like creating apps or scraping the web.

I want to learn something useful for my job. For example, I would like to be able to predict when a user will churn, predict customer lifetime value, and do machine learning in ways that will help the company and make me more valuable. I believe I need to learn pandas, numpy, seaborn, pyspark, tensorflow, matplotlib etc.

Thanks for the help!

r/learnpython on Reddit: How Much Python Do I Need to Learn for Data Analysis?
September 17, 2023 -

I work full-time and I’m continuing my informal education in Python, specifically for data analysis. I’m trying to gauge just how in-depth I need to go into Python’s core language features before I can use it professionally for data analysis.

From various classes and resources, there’s an extensive amount to learn, and I think people believe you can’t start using Python until you have proven with a certificate that you can write from scratch. However, is it reasonable to focus primarily on understanding and implementing data analysis libraries (like pandas, numpy, and matplotlib) instead of perfecting my broader Python knowledge? I'm wondering if, by diving into these libraries and working on my professional tasks, I will pick up the nuances of Python indirectly.

Is this a viable approach? Can one become proficient in data analysis with Python by diving directly into these libraries, or is a deeper foundational understanding of the language essential before progressing?

r/learnpython on Reddit: Learning python for data analysis
February 17, 2025 -

Strange-ish scenario here, but I just got hired as a Data Analyst for a company with no experience. The boss knows this and is fine with it; he is a close friend and wants to help me start a profession. He told me to throw some data sets into ChatGPT to help me and slowly learn how to use Python as I go. Anyone know some good textbooks or online programs I could check out to help expedite this process? I also have no background in Python LOL.

r/dataanalysis on Reddit: What are the most important uses of Python for data analytics?
October 3, 2023 -

I feel like a lot of my job in data analytics is simply looking at various metrics across different dimensions — e.g: which customers/regions/ads/demographics are driving KPIs and which aren’t — and why?

I am still in the early stages of learning Python, but I feel like a lot of the stuff I’m learning can be done in Excel/SQL, albeit slower and not automated.

I would like to know — what are some types of analysis that Python can do that other tools can’t?

How do you analysts use Python in their day-to-day job?

What Python skills are most important to master for data analysis?

What’s considered ‘advanced data analysis’ (not data science AI/ML) in Python?

Top answer
1 of 33
62
My friends and I have actually been building a product in this space, so we have some ideas on these questions. Would be great to hear what you think though!

1. What can Python do that other tools can't?

A big one is large data tasks. SQL can definitely achieve this, but if you have a csv file with millions of rows in it, it's impractical to try to use Excel. Even opening a huge file could crash your computer lol. With Python you can much more easily handle such files by loading them into a pandas data frame. There are functions in pandas that are much easier than figuring out the right formulas in Excel. In some cases it saves you a lot of manual work. Here are a few examples we ran into:

  • Regex for turning dates into a standard format, e.g. November 21 --> 11/21/2022. Sometimes you can do this easily in Excel, but not always.

  • Transposing the data with respect to a subset of the columns.

  • Filling cells based on the first populated cell above them.

2. How do analysts use Python in their day-to-day job?

We're actually still trying to figure this one out ourselves, but one of the patterns we've seen is that a lot of people will write scripts to clean up data files that come to them consistently messed up: empty rows, mismatched headers and data, etc. I also think there are probably a lot of people using matplotlib and seaborn plots instead of Excel templates, but like I said that's just a hypothesis. We're still investigating.

3. What Python skills are most important to master?

I would probably say that you should have a decent grasp of the matplotlib, seaborn, and pandas packages. But most importantly, you should just have an ergonomic way to run any Python scripts that you are creating. Get a good IDE like PyCharm, use a virtualenv that has all the main packages I mentioned above, create a GitHub repository for your scripts, and start practicing running the scripts locally on your csv files.

Once you have this, you can do a lot of heavy lifting by copying and pasting your code into ChatGPT and asking it to help you write the functions you need for your job. Do NOT underestimate what ChatGPT can do for coding. These models are trained on large swaths of the internet, and there are loads and loads of Python scripts publicly available. Since it came out in November, GPT-4 has probably 3xed my productivity. It will also help you learn much faster if you ask it to explain its work.

4. What's considered 'advanced data analysis' (not data science AI/ML) in Python?

I'm not really sure on this one. Hard to keep up with all these buzzwords!

If you're interested in trying out the product we're building, it would be great to get some feedback ( app.squack.io )! Basically it's a Jupyter notebook that acts on csv files, and every block of code can be generated with commands in everyday language (e.g. "Can you get rid of any empty rows?"). Once you've built your notebook, you can reuse the code by clicking "Spawn Automation." For now you'll just have to keep track of the resulting url, but we're going to build an automation library soon.

Hopefully this helps and good luck learning Python, you won't regret it!
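For anyone who wants to see what the cleanup tasks from point 1 might look like in pandas, here is a rough sketch. The column names and values are made up, `pd.to_datetime` is used in place of a hand-written regex, and "transposing with respect to a subset of the columns" is interpreted here as a `melt`, which may not be exactly what the commenter meant.

```python
import pandas as pd

# Hypothetical messy export; column names and values are invented.
raw = pd.DataFrame(
    {
        "region": ["North", None, None, "South", None],
        "date": ["November 21", "November 22", None, "December 1", None],
        "q1": [10, 12, None, 7, None],
        "q2": [11, 9, None, 8, None],
    }
)

# Drop rows where every data column is empty.
clean = raw.dropna(how="all", subset=["date", "q1", "q2"]).copy()

# Fill each blank cell from the first populated cell above it.
clean["region"] = clean["region"].ffill()

# Standardize dates. The year is not in the data, so 2022 is an assumption.
parsed = pd.to_datetime(clean["date"] + " 2022", format="%B %d %Y")
clean["date"] = parsed.dt.strftime("%m/%d/%Y")

# Reshape q1/q2 into rows while keeping region and date as identifiers.
long = clean.melt(
    id_vars=["region", "date"],
    value_vars=["q1", "q2"],
    var_name="quarter",
    value_name="value",
)
print(long)
```

Saving a script like this next to the csv files it cleans is one way to get the repeatable "consistently messed up file" workflow described in point 2.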
2 of 33
9
Probably classification, modeling and visualization are some of the most powerful and typical use cases for python in data analysis. (I’m not a professional, but I do have a masters in business analytics). These are what we use python for in academia at the masters level, there are other languages to accomplish the same things also.
r/learnpython on Reddit: Python For Data Analysts
February 20, 2024 -

Roadmap for Learning Python for Data Analyst

Hi, I need to learn Python for data analytics. Can anyone share a roadmap for it? After learning it for data analytics, I also want to learn it for automation tasks. If you are a Data Analyst, please mention that too. Thanks in advance.

Top answer
1 of 2
3
https://github.com/ossu/data-science might help with the roadmap.
2 of 2
1
A good roadmap for data science is:

  • Learn Python itself, focusing on the classes in Python's builtins module and the general principles behind object-oriented programming

  • Some of the standard modules closely related to the builtins module, such as collections and itertools

  • Some of the standard libraries which are fundamental to data science, such as math, statistics, random and datetime

  • Some of the standard libraries related to file formats, such as io, csv, json and pickle

  • The main third-party data science libraries, such as numpy, pandas, matplotlib and seaborn

The above is the foundation for machine learning and statistics.

Textbooks I recommend are:

  • Python Distilled by David M. Beazley (O'Reilly Online)

  • Python for Data Analysis by Wes McKinney

  • An Introduction to Statistical Learning with Applications in Python by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani and Jonathan Taylor

The books give an overview of Python itself, the data science libraries and statistical learning respectively. The first is accessible from the publisher's website using a free trial; the other two are open access.

Udemy courses that I took and found useful were:

  • 100 Days of Code by Angela Yu (Udemy)

  • Python for Machine Learning & Data Science Masterclass by Jose Portilla (Udemy)

Never pay full price for a Udemy course, as they discount them every couple of days.

The 100 Days of Code course will be pretty difficult for someone who has never programmed (but will be well worth the initial struggle). If it takes you more than 1 day to understand concepts and do exercises at the beginning, don't feel discouraged and don't try to rush things. It is instructor-led to about day 40 and gives an overview of Python. Later it focuses more on web development and is less useful for a data scientist. I stopped at day 40 and at that stage found it more appropriate to learn the data science libraries.

The Python for Machine Learning & Data Science Masterclass and the Python for Data Analysis book complement each other pretty well. The book is more detailed and is by the creator of the pandas library. I completed the rest of the 100 Days of Code subsequently just for the sake of completion (but feel I got the most out of the beginning sections).

There are also a number of courses on Coursera by Andrew Ng related to machine learning, but you won't be able to make the most of these without having the prerequisite skills, such as a basic understanding of Python and the data science libraries.

Also, if it helps, I've put together a pretty detailed GitHub repository of Python notebooks which cover Python and the data science libraries mentioned above.
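As a small illustration of why the standard-library steps in that roadmap are worth the time, here is a sketch that reads, groups, and summarizes data using only modules named above (csv, io, json, collections, statistics). The CSV content is invented, and `io.StringIO` stands in for a real file.

```python
import csv
import io
import json
import statistics
from collections import Counter, defaultdict

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "city,temp_c\nOslo,4.0\nLima,19.5\nOslo,6.0\nLima,20.5\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))

# collections: count rows per city and group readings by city.
counts = Counter(row["city"] for row in rows)
temps_by_city = defaultdict(list)
for row in rows:
    temps_by_city[row["city"]].append(float(row["temp_c"]))

# statistics: summarize each group without any third-party library.
summary = {city: statistics.mean(temps) for city, temps in temps_by_city.items()}

# json: serialize the result for another tool to consume.
print(json.dumps(summary, sort_keys=True))
```

The same grouping is a one-liner in pandas, but seeing it spelled out with dicts and lists makes it easier to understand what the third-party libraries do for you.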
r/dataanalysis on Reddit: How much Python do you use for Data Analysis on daily basis?
October 29, 2023 -

I am currently on a self-learning journey of Data Analytics technical skills.

I am wondering if I should keep Python at the very end of my journey. I mapped my journey as follows:

  • Data Analytics, Visualization and Forecasting using Microsoft Excel (completed)

  • Google Data Analytics (completed)

  • Microsoft Power BI (80% completed)

  • SQL with MySQL (about half completed)

  • Tableau (quarter completed)

  • Python (not yet started but I have intermediate level knowledge about Python programming)

I would like some feedback from fellow redditors working in Data Analysis field whether they use Python on a regular basis. Your feedback and/or advice will help me greatly in correcting the course of my journey.

r/dataanalysis on Reddit: Why is Python so highly recommended for data analysis and visualization?
November 30, 2022 -

I’ve been in the data analysis and reporting field at non-tech companies for over a decade, but don’t know how to code. I’m looking to change that by learning R or Python to conduct more efficient and complex data analysis and visualization. All the Data Scientists I speak with wholeheartedly recommend R. Why then am I seeing so many articles and posts recommending Python?

Top answer
1 of 10
94
For a data analyst/data scientist, displaying only R or only Python on a resume underlines very clear professional boundaries:

  • Only R: you're a pure statistician/analysis person, usually with an academic background, R is your high tower and you're not going to get your hands dirty in the data infrastructure stuff. Mostly useful if, for example, the company already has a skilled data engineering team, able to perfectly deliver to you, and you're mostly interacting with the stakeholders and management.

  • Only Python: you're able to be closer to the data infrastructure people, exchange code with the data engineers to implement stuff into production. Would be weird to ask a data analyst profile to do some infrastructure stuff, but clearly some "data scientist" hires are asked to do that in smaller companies, and it usually involves a lot of Python. Python is also essential in Machine Learning, so even pure academic people need to learn it to go into this area.

Clearly, R is a signal for a pure data analyst & math heavy profile, while Python implies, potentially, more general programming ability and versatility.

If you want to stay in this pure data analyst role, I would say just go for R as it's the right signal and tool for this objective. But if you want to open some potential new paths for your professional career, for example, if you want to work with data engineering or machine learning at some point, you can't do that without Python. That's why a lot of data people have the capacity to switch between being data analyst, data scientist and data engineer, given a few years. You can't have these trajectories by only sticking to R.
2 of 10
14
I'm from a software engineering background and prefer Python. My brother, from a statistics background, prefers R. He says R is packed with statistician-friendly libraries. Python is more popular than R, which is why more people talk about it. Just an observation though.
r/learnpython on Reddit: Python for data analysis courses recommendation.
June 10, 2025 -

Hello everyone, I recently started a new position (got a promotion) at an environmental research company and part of my new job is to do data analysis.

I did similar work for my previous position in Excel but now I need to do more complex stuff in JupyterLab and Python/SQL. More exactly we have huge databases with thousands of companies which each have hundreds of data points and are assigned scores based on various factors. I would need to analyze this data and look for outliers, or trends in a certain industry, or if we change something to our methodology what impact it would have on the scores.

A colleague of mine recommended me datacamp.com and I did some of their free courses and they seemed ok but I don't really like the subscription model as I don't have that much time to spend each day. I've also seen Angela Yu's course mentioned a lot on this sub as a good starting point but it seems a bit overkill for what I need.

Worth mentioning that I have no previous experience in programming except for semi-advanced Excel formulas, if that counts (from my initial interactions with Python they do seem a bit similar).

Which one do you recommend going for? Also worth mentioning is that I have an 800 euro educational stipend, so while I would like to spend as little of it as I can so I can also do other stuff, price is not really that much of an issue.

Thank you all for reading and have a great day!
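The kind of analysis described in this post (outliers within an industry, and the impact of a methodology change on scores) can be sketched in pandas roughly as follows. Everything here is hypothetical: the company data, the two metrics, the weights, and the "3 times the median absolute deviation" outlier rule are illustrative assumptions, not the poster's actual methodology.

```python
import pandas as pd

# Hypothetical data: companies, industries, metrics, and weights are invented.
companies = pd.DataFrame(
    {
        "company": ["A", "B", "C", "D", "E", "F"],
        "industry": ["energy", "energy", "energy", "retail", "retail", "retail"],
        "emissions": [50.0, 55.0, 120.0, 10.0, 12.0, 11.0],
        "governance": [70.0, 65.0, 60.0, 80.0, 75.0, 85.0],
    }
)


def score(df, w_emissions, w_governance):
    """Illustrative weighted score; higher emissions lower the score."""
    return w_governance * df["governance"] - w_emissions * df["emissions"]


# Outliers: compare each company with its own industry's median, scaled by
# the industry's median absolute deviation (MAD).
median = companies.groupby("industry")["emissions"].transform("median")
deviation = (companies["emissions"] - median).abs()
mad = deviation.groupby(companies["industry"]).transform("median")
companies["is_outlier"] = deviation > 3 * mad

# Methodology change: double the emissions weight and compare scores.
companies["old_score"] = score(companies, w_emissions=0.5, w_governance=1.0)
companies["new_score"] = score(companies, w_emissions=1.0, w_governance=1.0)
companies["delta"] = companies["new_score"] - companies["old_score"]
impact = companies.groupby("industry")["delta"].mean()
print(companies[["company", "industry", "is_outlier", "delta"]])
print(impact)
```

With real data the DataFrame would come from a SQL query rather than a literal, but the groupby-and-compare pattern stays the same.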

r/learnpython on Reddit: Seeking Advice on Learning Python and R for Data Analysis
October 15, 2024 -

Hi everyone,

I’m planning to learn Python and R with the goal of using them for data analysis and management.

I believe it’s important to first get a solid understanding of the foundations of both languages before diving into the specifics of data analysis.

I would really appreciate any advice or suggestions on:

  1. The best way to approach learning the foundations of Python and R.

  2. Any resources or courses that you recommend for beginners to build a strong foundation.

  3. Resources or courses that are specifically good for data analysis and management after mastering the basics.

Thanks in advance for your help!

r/learnpython on Reddit: How can I effectively learn Python for data analysis as a complete beginner?
February 17, 2026 -

Hi everyone!

I’m completely new to Python and I'm particularly interested in using it for data analysis. I've read that Python is a powerful tool for this field, but I’m unsure where to start.

Could anyone recommend specific resources or beginner-friendly projects that focus on data analysis?
What libraries or frameworks should I prioritize learning first? I want to build a solid foundation, so any advice on how to structure my learning path would be greatly appreciated.

Thank you!

r/learnpython on Reddit: Where to start with Python for Data analysis?
May 28, 2025 -

Hey all,

I want to learn python to go into business analytics or data science, and I don't really know where to start with Python. Are there any online courses or videos you'd recommend, as well as what topics to start with and then go about.

As well as any general tips or anything to know about Python, since I have very limited knowledge. Thanks :)

r/datascience on Reddit: study general python vs. study python for data analytics
June 26, 2022 -

Hi all, thank you for taking the time to answer this. My question is regarding where I should spend my time based on my goals:

I'm currently studying Python from Angela Yu's 100 Days of Code Udemy course, but am not interested in making websites or apps, as my goal is to use it for data analytics as a data/business analyst or in data visualization. There is another Udemy course where Python is used specifically for data analysis and visualization. Just wondering whether I should focus on this instead. Both instructors are great, so I'm not looking for advice on their teaching style.

Angela Yu's Python: https://www.udemy.com/course/100-days-of-code/learn/lecture/17965120#reviews

python for data analytics: https://www.udemy.com/course/python-for-data-analysis-visualization/

r/learnpython on Reddit: Learning Python path for data analyst
October 17, 2021 -

Hello guys, I'm starting to learn Python with the hope of changing my career. I worked in another field which doesn't connect with data or IT, but I really like data.

Just finished a basic data analyst bootcamp by the 365 team on Udemy. I have to say the course is amazing and got a total noob like me hooked.

Since I want to delve deeper into this field, can you guys recommend courses or sources with which I can improve my knowledge and ability, and also a certificate for a proper data analyst role?

reddit.com › r/learnpython › [deleted by user]
I want to learn python for data analysis
September 19, 2024 - There are a lot of free resources you can find to learn Python for this purpose. In my case, I was a complete beginner in data as I had a non-tech academic background. I enrolled in a data analyst virtual training, and to strengthen my skills I went through freeCodeCamp. (link: https://www.freecodecamp.org/learn/data-analysis-with-python/ ) They will offer you a certificate as well.
r/learnpython on Reddit: Data Analysis Resources for Python
May 15, 2020 -

Introduction

Data Science is an increasingly important tool for companies looking for competitive advantage, and Data Scientist jobs are coveted and often well paid. As a result, the internet is awash with sites and Medium posts dedicated to teaching data science topics, many of which are of questionable value.

This post includes a list of resources which could help start you on the journey to being a data scientist, but focus on data analysis. This means there is little to no machine learning mentioned here, but there is a lot of focus on statistical analysis of data.

Credentials

I’m a data scientist with a maths PhD and was a quantitative analyst before that. I work in the energy industry and spend a lot of time working with generalized additive models for time series forecasting, chucking stuff at random forests, doing Bayesian inference with pymc3, and survival analysis with lifelines. I don’t use a lot of Tensorflow or PyTorch because they tend not to fit the domain of my problems well, but I revisit them every few months to pit them against our existing models.

Disclaimer

This post is purely my opinion, and in particular reflects my view that people too quickly jump to ML/DL methods when ‘traditional’ methods could do better. Obviously this is very domain-specific—you’d struggle to generate meaningful text with a linear regression.

Two final points before diving in:

  • There is a lot of content between the sources below; don’t feel you have to read and understand them all by any stretch, but don’t expect to be on top of this stuff in a week or a month. Three months is probably the minimum amount of time required to get a feel for this, and more like a year to be useful to a third party

  • Domain knowledge is super important; if you are interested in a particular industry, read up on that too to make yourself saleable

Learning Resources

Python Basics

Nothing here is specific to data analysis, so take a look at the r/learnpython FAQ.

In general, good data science often looks to the outside observer like software engineering. It’s not enough to build something in a Jupyter notebook and be done (many claim success in “productionising” notebooks, and all are wrong); so you also need to learn about:

  • Version control (git is the de facto standard, and if you understand that you’ll be able to pick another VCS easily enough. Note that IDEs such as PyCharm give a friendly interface to many commands, but you still have to know the basics.)

  • Packaging

  • Unit testing (I like pytest)
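To give a feel for the unit-testing point, here is a minimal pytest-style sketch. The helper `normalize_header` and the file name `test_cleaning.py` are hypothetical; the only pytest convention relied on is that it collects functions whose names start with `test_` and treats a failing `assert` as a failed test.

```python
def normalize_header(name: str) -> str:
    """Lower-case a column header and join its words with underscores."""
    return "_".join(name.strip().lower().split())


# pytest collects functions named test_*; plain assert statements are enough.
def test_normalize_header_strips_and_lowercases():
    assert normalize_header("  Customer ID ") == "customer_id"


def test_normalize_header_collapses_whitespace():
    assert normalize_header("Order   Date") == "order_date"
```

Saved as, say, `test_cleaning.py`, this would be run with the `pytest` command from the same directory. Committing both the helper and its tests to git covers the version-control point as well.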

Data Analysis

There’s no getting away from the fact that mathematics is at the core of data analysis, but you don’t have to be John Conway to be useful. In addition, statistics is by far the most important at this level and you don’t need to understand the minutiae of the subject (which is based in measure theory and is tough). Unfortunately I’ve never found a good introduction to statistics with Python (there are plenty for R!), so you have to dip into a number of different resources.

All of Statistics (PDF available here)

Perhaps not all, but Larry Wasserman has written a very approachable introduction to statistics here. The link includes the few data sources given in the book, but it’s very much a textbook. At 500 pages it’s a bit daunting, so I recommend focusing on chapters 1–11 first, then the chapters on linear regression and multivariate models, which is about 200 pages total. Read along with the SciPy docs; in addition take a look at pythonfordatascience.org which calls out useful functions in SciPy and statsmodels.
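As an example of what "reading along with the SciPy docs" can look like, here is a small sketch of two ideas from an introductory statistics text: a confidence interval for a mean and a two-sample test. The samples are synthetic and the parameters are arbitrary; Welch's t-test is chosen here as an assumption, not because the book prescribes it.

```python
import numpy as np
from scipy import stats

# Synthetic samples; the means, spread, and sizes are arbitrary.
rng = np.random.default_rng(seed=0)
a = rng.normal(loc=10.0, scale=2.0, size=200)
b = rng.normal(loc=10.5, scale=2.0, size=200)

# 95% confidence interval for the mean of `a`, based on the t distribution.
mean_a = a.mean()
low, high = stats.t.interval(0.95, df=len(a) - 1, loc=mean_a, scale=stats.sem(a))

# Welch's t-test: do the two samples plausibly share a mean?
result = stats.ttest_ind(a, b, equal_var=False)

print(f"mean of a: {mean_a:.2f}, 95% CI: ({low:.2f}, {high:.2f})")
print(f"t statistic: {result.statistic:.2f}, p-value: {result.pvalue:.4f}")
```

Changing the seed or the sample sizes and watching how the interval and p-value move is a cheap way to build intuition for the corresponding chapters.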

OpenIntro Statistics

An alternative (and possibly a better alternative) to AoS, this textbook is available with an optional contribution, and is used by a number of colleges in the U.S. I've not read it, but on a closer look, it appears to be pretty great. As with AoS you'll have to read along with the SciPy and statsmodels docs.

Linear Algebra Done Right

Currently available for free from Springer, this covers a lot of ground in ~300 pages. Less immediately applicable than the stats books, but definitely worth keeping for the future

Python Data Science Handbook

Jake VanderPlas is the author of the excellent altair plotting library and a pretty bright chap. This book serves as a good introduction to NumPy, Pandas, Matplotlib and Scikit-Learn, and the link includes its full text as Jupyter Notebooks, which is awesome. You needn’t bother with the Scikit-Learn chapters unless you want to jump ahead.

Python for Data Analysis and the pandas docs

Which of these you prefer is largely a matter of preferring one medium over another, but PfDA’s second edition is already slightly outdated for pandas 1.0.3, though certainly not enough that it’s not a very useful resource.

Data Science from Scratch

Joel Grus’s book kinda does do what I assert isn’t possible—take you from zero to data scientist hero in a relatively short text. The criticism I would level at it is that it (necessarily) doesn’t go into sufficient depth everywhere, but what it does brilliantly is implement most things from scratch (duh!) to give you a good grounding in the basics.

Anatomy of Matplotlib

This is a great video to get a better understanding of how to work with Matplotlib, which is definitely the least Pythonic library still in use by data analysts today. It’s also slightly outdated, but hugely valuable.

Introduction to Survival Analysis — lifelines docs

Great introduction to survival analysis, which will either help you look like a superstar or be completely irrelevant.
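For readers who have not met survival analysis before, here is a from-scratch sketch of the Kaplan-Meier estimator, the basic survival curve that introductions to the topic usually start with. This is deliberately plain Python and is not the lifelines API (lifelines offers a `KaplanMeierFitter` class for real work); the durations and censoring flags are invented.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] for each time with an event.

    durations: time until the event, or until the subject was censored
    observed:  True if the event happened, False if the subject was censored
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events that happen exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: months until churn; False means still a customer.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

for time, prob in kaplan_meier(durations, observed):
    print(f"t={time}: S(t) is about {prob:.2f}")
```

The key idea is that censored subjects (here, customers who had not churned when observation ended) still count in the at-risk group until they drop out, rather than being discarded.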

Winning with simple, even linear models

I was at this talk at PyData London a few years ago and it was the best of the conference in my opinion. Vincent makes the argument that people are too quick to leap to ML/DL methods when simpler models could do as well, if not better.
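To show how little code a simple linear model needs, here is a sketch of an ordinary least-squares fit with NumPy, compared against a "predict the mean" baseline. The data is synthetic (a known line plus noise) and is not taken from the talk.

```python
import numpy as np

# Synthetic data: a known line (slope 3, intercept 2) plus Gaussian noise.
rng = np.random.default_rng(seed=1)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=x.size)

# Design matrix with a column of ones so the model can learn an intercept.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Compare against the simplest possible baseline: always predict the mean.
baseline_mse = np.mean((y - y.mean()) ** 2)
model_mse = np.mean((y - X @ coef) ** 2)

print(f"intercept={intercept:.2f}, slope={slope:.2f}")
print(f"baseline MSE={baseline_mse:.2f}, linear model MSE={model_mse:.2f}")
```

Having a baseline like this on hand is also what makes it possible to judge whether a more complex model is actually earning its keep.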

The Visual Display of Quantitative Information

If you buy one book on visualisation, it should be this. (If you buy two, it should be this and The Grammar of Graphics.)

Data Science

Briefly, here’re a few resources that cover data science proper, but don’t expect to get here any time soon!

  • r/datascience (includes all the other resources in this section)

  • The Elements of Statistical Learning and An Introduction to Statistical Learning (the former goes into more detail on the maths than the latter)

  • Pattern Recognition and Machine Learning

  • Andrew Ng’s Machine Learning course

Data Sources

As mentioned before, if you’re interested in a particular industry then see if you can get data related to it. Otherwise, these are some general sources of good-quality data.

  • Scikit-Learn data has some really good ‘toy’ datasets that are useful for playing around with descriptive and inferential statistics, besides the skl estimators

  • data.gov.uk and data.gov have hundreds of thousands of data sets. Many of these offer a great opportunity to practice cleaning up data with pandas because they come in all shapes and sizes

  • OpenIntro Statistics data sets used in this textbook

Out-of-scope

The following topics haven’t been mentioned in this post yet, because I consider them adjuncts to the main theme, but will probably be of importance:

  • SQL (probably very important!)

  • Big data (possibly less so, but in general the problems of big data are about finding efficient ways of doing the same stuff with… big data) inc. e.g. PySpark etc.

  • Continuous integration/continuous delivery

  • Docker/Kubernetes

Postscript

The original version of this post appeared ~3 weeks ago and the number of links in it got it marked as spam and it was deleted by the mods; thanks to u/novel_yet_trivial for sorting it out!

r/analytics on Reddit: How Python is used for Data analysis ?
November 17, 2021 -

What do you need to learn from the Python language to perform data analysis? (i.e. How to begin? As you all know, Python is vast.)

I'm currently in a huge Rabbit hole because I don't understand where to begin.

Any tutorial recommendations could be really helpful to start somewhere.