Microsoft: DAT208x Introduction to Python for Data Science – Module 1

This week I started the self-paced EdX class Microsoft: DAT208x Introduction to Python for Data Science. I’m taking it in part because, for a few years now, the library science literature has been increasingly insistent that data science is something librarians really need to know more about. There are also many, many publicly available datasets, and I’d like to know how far I can get on my own in analyzing and visualizing the ones I’m interested in.

To help cement my learning, I’ll be blogging each module here. I should also mention that I have a smattering of several programming languages: usually enough to recognize a language, and sometimes enough to write my own programs or to modify the work of others to get what I want. I would not currently describe myself as fluent in anything. But my previous experience may help me assimilate the material in this course more easily than someone who hasn’t been exposed to programming languages at all.

Module 1: Basic Python

This was a mix of very short video lectures and heavily guided programming exercises in a lab environment at datacamp.com. I found them effective for what they presented. One significant problem for someone new to online learning is that every time you finish a lab at DataCamp, you are presented with a dialog box that invites you to “upgrade to continue” and offers you a $29.99 pass to all DataCamp courses. This is NOT needed for the EdX course, but not everyone will scroll down the dialog box to find the “continue learning” button, which costs nothing and takes you back to the EdX interface.

What did I learn? As someone who has used C++ and JavaScript, I mostly learned that there are some programming languages not obsessed with semicolons. I learned the Python-specific ways to do common calculations, assign variables and comment code. I’m glad they’re building in commenting from the beginning as a good practice.
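For anyone who hasn’t seen the syntax, here is a minimal sketch of those three basics. The variable names and numbers are illustrative assumptions, not taken from the course exercises.

```python
# A comment starts with a hash sign and runs to the end of the line.

# Variables are created by assignment: no type declaration, no semicolon.
books = 12          # an integer
price_each = 7.5    # a float

# Common calculations use the usual operators.
total = books * price_each   # multiplication
squared = books ** 2         # ** is exponentiation
remainder = books % 5        # % gives the remainder of a division

print(total, squared, remainder)
```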

I learned a few things that I *think* are specific to Python:

  • You can multiply strings! "Hey " * 2 gives "Hey Hey " (trailing space included).
  • You can get the type of a variable by calling the function type(variable_name). This was useful in the exercises, since Python’s way of defining variables seems less formal to me than in other languages.
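Both points can be tried out with a few lines like these. This is only a sketch and should run in any Python 3 interpreter; the variable names are hypothetical and chosen purely for illustration.

```python
# Repeating a string with the * operator.
greeting = "Hey " * 2
print(greeting)          # prints: Hey Hey  (with a trailing space)

# type() reports the type of the value a variable currently holds.
count = 3
ratio = 0.5
print(type(greeting))    # <class 'str'>
print(type(count))       # <class 'int'>
print(type(ratio))       # <class 'float'>

# The same name can later be bound to a value of a different type,
# which is part of why checking with type() is handy.
count = "three"
print(type(count))       # <class 'str'>
```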

This module also pointed us to the main Python documentation at www.python.org and showed us where we can download Python for our own computers. I’m holding off on downloading Python for the next few modules. I want to see it put to some practical use before I install another programming environment on my computer.

The next module, Python Lists, ought to have more substantial learning opportunities for me. I’ll report on that after I take it.

PS – I’m taking this class fully on my own time because data science is not in my current set of job duties. While there may be insights we can get from analyzing and visualizing Alaska Public Library Statistics, there is simply too much work in other areas to justify taking this course on work time.


One Response to Microsoft: DAT208x Introduction to Python for Data Science – Module 1

  1. digilou says:

    Python and Ruby are both great for letting you get stuff done rather than being obsessed about semicolons and such. Good luck! And please pass along anything to me you feel that we both can use at work.
