Review: Intergalactic Empires

Link: Review: Intergalactic Empires ‹ Planetary Defense Command

If your duties cover readers’ advisory, or you just like science fiction, Planetary Defense Command is a good blog to follow. The post I am linking to above is a set of short reviews of vintage science fiction books involving galactic empires. The post also links to what I think is a good series on using a galactic empire as a story setting.

The blogger at Planetary Defense Command seems pretty knowledgeable about science fiction. Give him a try.

Posted in Book Reviews

Microsoft: DAT208x Introduction to Python for Data Science – COMPLETED

Today I completed the self-paced EdX class Microsoft: DAT208x Introduction to Python for Data Science. My last push included Module 6, the intriguingly titled “Control flow and Pandas,” followed by an accidental completion of the final exam. More about that later. I finished the audit track of the course with 91%. I could have purchased a Verified Certificate from EdX for $49.00, but I haven’t seen the EdX certificate used much in the job market. I know what I learned and that’s enough for now.

Enough of housekeeping and celebration. Let’s talk about Module 6, “Control flow and Pandas.” The control flow portion was a presentation on Booleans (which most librarians really ought to know in their sleep) and if-then statements. The if-then syntax is a bit different in Python than in other programming languages that I’ve been exposed to, but still familiar.
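Here is a minimal sketch of Booleans and an if statement in Python. It is not taken from the course; the variable names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical values chosen only for illustration.
overdue_days = 12

# A comparison produces a Boolean (True or False).
is_overdue = overdue_days > 0
is_long_overdue = overdue_days > 30

# Boolean operators combine conditions, much like AND / OR / NOT in a catalog search.
needs_reminder = is_overdue and not is_long_overdue

# Python marks each branch with a colon and indentation
# rather than "then" / "end if" keywords or braces.
if is_long_overdue:
    status = "send to billing"
elif needs_reminder:
    status = "send reminder"
else:
    status = "no action"

print(status)
```

The colon and the indentation are the parts that differ most from languages that use "then" or curly braces.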

 

Pandas turns out to stand for Python Data Analysis. It is a set of tools that lets you work on arrays of data that mix different data types. It lets you import files, either from your computer or by URL. Once the data is imported, you can select specific rows and columns by name. It seemed very handy, though I need to read up on it some more and especially practice using it.
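A small sketch of importing data and selecting by name might look like the following. The branch names and figures are made up, and the CSV text is held in a string so the example runs without a file; `pd.read_csv` also accepts a file path or a URL in the same position.

```python
import io

import pandas as pd

# Hypothetical data standing in for a CSV file on disk or at a URL.
csv_text = """branch,region,visits,open_sundays
Branch A,North,120000,True
Branch B,North,85000,False
Branch C,South,310000,True
"""

# index_col=0 uses the first column (branch) as the row labels.
branches = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Select one column by name.
visits = branches["visits"]

# Select a single value by row label and column name.
branch_b_visits = branches.loc["Branch B", "visits"]

# Select several rows and columns by name.
subset = branches.loc[["Branch A", "Branch C"], ["region", "visits"]]

print(branches.dtypes)   # each column keeps its own data type
print(branch_b_visits)
print(subset)
```

Note that text, numbers and True/False values sit side by side in one table, which is the mixed-type handling described above.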

After I completed Module 6, I thought I’d take a look at what the final exam looked like. I might be wrong, but when I entered the final exam I seemed to be put into a “this is your one chance to take the final – you have four hours total to complete the problems” page. I didn’t immediately see a graceful way out, so I basically took the final cold. It was a mix of question types and you had four minutes to answer a given question before you moved on to the next. A couple stumped me outright, but I was able to provide some kind of answer for all but a few and I wound up with an 82% on the final. Not fantastic, but pretty good for taking it cold. Averaged with my quizzes and practice labs I wound up with 91%, so I’m not looking for a retake.

While the course did try to sell you a $29/mo subscription to Datacamp at the end of Every. Single. Exercise, I still recommend this course to others. It’s a great way to get your feet wet both in Python and the beginnings of data analysis.

My next step, which I hope to take tomorrow, is to download and install Python, Numpy, Matplotlib and Pandas on my laptop. I’ll also bookmark documentation. Then I’ll have a look at a public dataset, probably library related, and see what I can do with it.

In the longer term, I should do some reading up on data science. I’ve had a number of reading recommendations from people, so it’s just a matter of picking something up and working through it.

Posted in lifelong learning

Microsoft: DAT208x Introduction to Python for Data Science: Module 5 Matplotlib

Today I completed Module 5 of the self-paced EdX class Microsoft: DAT208x Introduction to Python for Data Science – Matplotlib.

As you might guess from the title of this module, it was all about plotting data. I learned how to do line plots, scatter plots and how to graph three variables at once. The material looked at population trends and the correlation between life expectancy and per capita GDP.

The lessons covered different types of plots, how to customize the axes and some material on histograms, which was presented as a good early step in customizing your data. The commands were basic, but powerful. Again, Excel has similar functionality but I can see where it might be easier to customize a plot in Python. To get a sense of the full range of plots that matplotlib can do, check out their gallery.

One key piece I don’t yet have is how to access external files. I assume I can figure this out from the documentation if it isn’t presented as a lesson, but until I have that piece, it will be hard to do analysis on the public library stats and other data sets I might be interested in. But I’m intrigued to try.

I found this module to be easier than the previous two, probably because I’m decent at producing charts in Excel and am familiar with the basics of plotting.

I also had another reminder that I need more education in data in addition to programming tools. There was one exercise where we were instructed to plot one variable on a logarithmic scale to make the trend show up better. It did, but I’m not sure why or under what conditions you’d want to use a logarithmic scale.
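One commonly given reason, sketched here rather than drawn from the course, is that a logarithmic axis spaces values by ratio instead of by difference, which helps when the data spans several orders of magnitude. The GDP figures below are hypothetical, and the sketch uses only arithmetic rather than an actual plot (in Matplotlib the switch is `plt.xscale('log')`).

```python
import math

# Hypothetical per capita GDP values (dollars), each double the one before.
gdp = [500, 1000, 2000, 4000, 8000, 16000, 32000]

# On a linear axis, spacing follows the raw differences,
# so low values bunch together and high values spread far apart.
linear_gaps = [b - a for a, b in zip(gdp, gdp[1:])]

# On a logarithmic axis, position follows log10(value),
# so every doubling takes up the same amount of space.
log_positions = [math.log10(v) for v in gdp]
log_gaps = [round(b - a, 3) for a, b in zip(log_positions, log_positions[1:])]

print(linear_gaps)
print(log_gaps)
```

The linear gaps keep growing while the log gaps stay equal, so a trend driven by percentage change shows up as a steady progression instead of being squashed into one corner of the plot.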

The final module has an intriguing title: Control Flow and Pandas. I’m guessing it’s not about pandas munching sticks of data.

 

Posted in lifelong learning

Microsoft: DAT208x Introduction to Python for Data Science: Module 4 Numpy

Today I completed Module 4 of the self-paced EdX class Microsoft: DAT208x Introduction to Python for Data Science – Numpy.

Just a quick moment of celebration: Yay! I’m two-thirds of the way through a programming class in a topic that might have made me run away screaming in college.

The material is getting harder and denser, as it probably should in a class that teaches a lot of new material.

Numpy is short for numeric Python. It seems to be pronounced “numb pie” instead of “numb P” which makes me think of paraphrasing the Gumby theme song to “He can analyze any data set ... Num-py!”

But I digress. This is the module where I really started to see the power of Python and realize I may need to study some aspects of statistics more.

Numpy’s main strengths in my view are 1) the ability to work on entire tables of data at once with no need for loop code, 2) its built-in package of statistical functions and 3) relatively easy subsetting of arrays.
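A short sketch of those strengths, assuming Numpy is installed and using made-up heights and weights:

```python
import numpy as np

# Hypothetical heights (metres) and weights (kilograms) for five people.
height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])

# Whole-array arithmetic: BMI for everyone at once, with no loop.
bmi = weight / height ** 2

# Built-in statistical functions.
mean_height = np.mean(height)
median_height = np.median(height)

# Subsetting with a Boolean condition: keep only the BMI values above 23.
high_bmi = bmi[bmi > 23]

print(bmi)
print(mean_height, median_height)
print(high_bmi)
```

The same calculation with plain Python lists would need a loop or a list comprehension for the BMI line and for the filter.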

The MS course also started to go into a few data analysis techniques apart from programming. Two examples:

  1. When you first get your data, it is very helpful to print the mean and median of each of the variables in your data. If the mean and median are far apart, and especially if the mean is an unrealistic value (say 2000 inches for human height) it may represent a flaw in data gathering and/or retrieval.
  2. It offered some tips on testing a guess/hypothesis, working through an example of whether soccer goalkeepers were generally taller than others. It also offered an example of seeing whether there was a correlation between height and weight.
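The mean-versus-median check from the first tip can be sketched with Python's standard `statistics` module (Numpy offers `np.mean` and `np.median` for arrays). The heights and the cutoff for "far apart" are assumptions made for illustration only.

```python
import statistics

# Hypothetical heights in inches; the 7000 is a deliberate data-entry error.
heights = [64, 66, 67, 68, 69, 70, 71, 7000]

mean_height = statistics.mean(heights)
median_height = statistics.median(heights)

# Arbitrary rule of thumb for this sketch: flag the data when the mean
# differs from the median by more than half the median.
suspicious = abs(mean_height - median_height) > 0.5 * median_height

print(mean_height)
print(median_height)
print(suspicious)
```

One bad value drags the mean far away from the median, which barely moves, and that gap is the warning sign.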

I also learned how to generate simulated data by passing parameters to a randomizing function.
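As a rough sketch of generating simulated data, assuming Numpy is installed: the mean, standard deviation, sample size and seed below are illustrative values, not figures I can confirm from the course.

```python
import numpy as np

# Seeded generator so the sketch is repeatable; the seed value is arbitrary.
rng = np.random.default_rng(42)

# Hypothetical parameters: mean 1.75 m, standard deviation 0.20 m, 5000 samples.
heights = rng.normal(loc=1.75, scale=0.20, size=5000)

# The sample statistics should land close to the parameters passed in.
print(heights.shape)
print(round(heights.mean(), 2), round(heights.std(), 2))
```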

At this point, I think a number of things you can do with data in Python are similar to what can be done in Excel. But I get the sense that Python will handle much larger datasets than Excel can. It may also be easier to compactly report the results. Also, an examination of the documentation at www.numpy.org may yield functionality not available in Excel.

I haven’t yet established a home Python environment, but this lesson gave me inducement to do so. I have a few datasets I’d like to play with. Though at this point we haven’t covered importing data files into the Python environment.

Next module, likely done Sunday or Monday, will be on plotting data. Something I’m very much looking forward to.

Posted in lifelong learning

Social Media Decisions: Bye Bye Personal Blog

In my post 2016 Personal Social Media Inventory, I shared the social media I was currently using. I also shared the outlets I was definitely keeping, the ones I planned to get rid of and several I was still mulling over.

My “personal” blog of Eclectic Alaskan was on my list of maybes. I’ve had it for a long time, but a lot of the things I used to share on my personal blog, I’ve been sharing through Facebook. So I needed to either recommit to the blog or share my personal stuff (photos, politics, etc) through Facebook. After some soul searching, I reluctantly froze my blog, leaving up posts as historical information. I explained my decision in my last blog post at Eclectic Alaskan.

To clarify, THIS blog WILL continue. It is a convenient place to blog my learning activities and opinions about developments in the library and information science field.

Depending on how Facebook develops (degrades/enrages), I reserve the right to revive Eclectic Alaskan.

Posted in social media

Microsoft: DAT208x Introduction to Python for Data Science – Module 3

Today I completed Module 3, Functions and Packages, of the self-paced EdX class Microsoft: DAT208x Introduction to Python for Data Science.

This was almost as much work as module 2 (lists), which is good because it means I’m learning new things.

The short summary of this module is:

Functions – Reusable bits of Python code used for a particular task. If you can think of a particular task, there is likely a function for it.

Methods – Functions tied to a specific type of Python object. They are called with a “.” after a variable name. Here’s an example of the difference using the variable “room” with the value “poolhouse”:

Function – print(room) – this prints “poolhouse”

Method – room.count("o") – This counts the number of times the letter “o” appears in the variable “room” whose value is currently set to “poolhouse”. If we used the command:

print(room.count("o")), we would get “3”, the number of times that the letter o appears in poolhouse.
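Put together as a runnable sketch (the extra `len` and `upper` calls are my additions for contrast, not part of the example above):

```python
room = "poolhouse"

# Functions are called on their own, with the variable passed in as an argument.
print(room)
length = len(room)

# Methods are called with a "." after the variable name;
# which methods are available depends on the type of the object (here, a string).
o_count = room.count("o")
shouted = room.upper()

print(length)
print(o_count)
print(shouted)
```

Note that code needs plain straight quotes around "o"; the curly quotes a word processor or blog editor produces will cause a syntax error.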

Packages – These are directories of Python functions and methods. Because there are a large number of discipline-specific packages for Python, the basic distribution of Python doesn’t have them all. There is a tool called “pip” you can use to download and install packages you need for your work. According to the instructor for this course, three common packages needed for data science are Numpy, Matplotlib and Scikit-learn. Numpy and Matplotlib have their own separate modules in this class.

Another interesting thing about packages is that installing them into your programming environment isn’t enough. There is at least one more thing, and usually two, that you need to do in your code itself:

  1. Have a line that imports the package (or subpackage, or function)
  2. If you’ve imported the entire package, you’ll need to preface the function with the package name.

So if I want to use the radians function from the math package to determine the number of radians in 12 degrees, I’d need these two lines in my program:

import math

print(math.radians(12))

If you don’t want to put “math.” in front of radians, Python lets you import single functions. So I could execute the radians command as:

from math import radians

print(radians(12))

But importing functions this way can be confusing to others looking at your code, particularly with longer programs, because it is no longer obvious which package a function came from. So I’ll probably import full packages. Not sure what this does to my actual program length. I’ll look into that later.

With this module, I’ve become convinced of the value of downloading Python to my home computer and working on it further. I’ve got other things I need to do today, but will try to start setting up my coding environment this week.

Aside from the constant ads from Datacamp during the lab portion of this course, I’m really liking the course organization of short lectures followed by hands-on exercises. It makes me feel like I’m getting stuff done. There should be some way I could start doing training videos in a similar way for database or tech training – though I’m not sure how I’d get the hands-on piece.

Posted in lifelong learning

Social Media Decisions: Bye Instagram (for now)

In my post 2016 Personal Social Media Inventory, I shared the social media I was currently using. I also shared the outlets I was definitely keeping, the ones I planned to get rid of and several I was still mulling over.

Instagram was one of my maybes. I had decided to close this account, because I wasn’t getting much value from it. When I went to close out my account, I saw they had an option to temporarily disable the account. It stays disabled (hides everything) until you activate it again.

I decided to hedge my bets by simply disabling my Instagram account. If I decide I really want it after all, I don’t need to recreate everything. If I don’t go back for another six months or a year, I’ll fully delete it.

Posted in social media