This week I started the self-paced EdX class Microsoft: DAT208x Introduction to Python for Data Science. I’m taking it in part because, for a few years now, the library science literature has been increasingly insistent that data science is something librarians really need to know more about. Additionally, there are many, many publicly available datasets, and I’d like to know how far I can get on my own in analyzing and visualizing the ones I’m interested in.
To help cement my learning, I’ll be blogging each module here. I should also mention that I have a smattering of several programming languages: usually enough to recognize a language, and sometimes enough to either write my own programs or modify the work of others to get what I want. I would not currently describe myself as fluent in anything. But my previous experience may help me assimilate the material in this course more easily than someone who hasn’t been exposed to programming languages at all.
Module 1: Basic Python
This was a mix of very short video lectures and programming exercises, with serious handholding, in a lab environment at datacamp.com. I found them effective for what they presented. One significant problem for someone new to online learning is that every time you finish a lab at datacamp, you are presented with a dialog box that invites you to “upgrade to continue” and offers you a $29.99 pass to all datacamp courses. This is NOT needed for the EdX course, but not everyone may scroll down the dialog box to find the “continue learning” button, which does not charge them and lets them get back to the EdX interface.
I learned a few things that I *think* are specific to Python:
- You can multiply strings! "Hey " * 2 gives "Hey Hey " (the trailing space is repeated too).
- You can get the type of a variable by calling the function type(variable_name). This was useful in the exercises, since Python’s ways of defining variables seem less formal to me than those of other languages.
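A minimal sketch of both points follows. The variable names and values are made up for illustration and are not taken from the course exercises.

```python
# Repeating a string with the * operator
greeting = "Hey " * 2
print(greeting)  # prints the text twice, including the trailing space

# Python does not require declaring a variable's type up front,
# so type() is a handy way to check what a name currently holds.
count = 3            # illustrative values, not from the course
price = 29.99
label = "datacamp"

print(type(count))
print(type(price))
print(type(label))
```

Running this shows that the same kind of assignment produces an integer, a float, and a string, depending only on the value on the right-hand side.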
This module also pointed us to the main Python documentation at www.python.org and showed us where we can download Python for our own computers. I’m holding off on downloading Python for the next few modules. I want to see it put to some practical use before I download another programming environment to my computer.
The next module, Python Lists, ought to have more substantial learning opportunities for me. I’ll report on that after I take it.
PS – I’m taking this class fully on my own time because data science is not in my current set of job duties. While there may be insights we can get from analyzing and visualizing Alaska Public Library Statistics, there is simply too much work in other areas to justify taking this course on work time.