Craftsman’s Approach to Tool Selection / Deep Work

I recently finished the book:

Newport, Cal. 2016. Deep work. Find in a Library

There’s a lot I could say about this book and how it has the potential to help people develop, but I think my coworker has already done a good job of explaining how that can work.

I’d like to use this blog post to highlight a domain-neutral framework for choosing one’s tools that Cal Newport offers on page 191 of Deep Work. He calls it “The craftsman’s approach to choosing tools”:

Identify the core factors that determine success and happiness in your professional and personal life. Adopt a tool only if its positive effects on those factors substantially outweigh its negative impacts.

While I’m still working on identifying all of those core factors, I realized that two of them are: 1) staying present with my family and friends when I’m with them and 2) staying on task when I start something.

I started thinking about the times when I’ve been at gatherings, gotten bored, and started browsing Facebook in a corner instead of listening to what’s being said. I’ve also thought about the many times I’ve picked up my phone to look up a specific thing, use it as a calculator or message someone – only to notice unread notifications on Facebook or Twitter, check them, and forget what I was going to do.

Having Facebook and Twitter on my phone has been useful – especially when I want to post a quick picture or see what a particular person has been up to. But using the Craftsman’s approach above, I realized that this utility was being swamped by the negative effects of social withdrawal and distraction when trying to use my phone as a tool. So off my phone they went. I still have the accounts for now, though I really need to do more of an analysis on Twitter.

I don’t see the Craftsman’s approach as a Luddite one. Any tool whose pluses outweigh its minuses in terms of contributing to your goals ought to be adopted. But I find it a welcome corrective to the idea that if we find any usefulness in a new thing at all, we’re committed to using it.

Posted in me, Uncategorized

Python and Data Science: Update

It’s been a month since I finished my edX class and I wanted to provide a quick(ish) update. Since I completed the course I’ve:

  1. Installed the Anaconda python package on my laptop. It includes a very nice editor called Spyder
  2. Worked at cleaning up my data spreadsheet of library data
  3. Realized that rather than hard-coding specific columns to analyze, I ultimately want a menu-driven program where a user could ask to have data items of their choice analyzed
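
To make item 3 concrete, here’s a rough sketch of the kind of menu-driven flow I have in mind. The column names and data are made up for illustration – my real spreadsheet will differ, and actually reading the file (with the csv module or pandas) is still ahead of me.

```python
def summarize(values):
    """Basic statistics for one numeric column (blank cells skipped)."""
    nums = [float(v) for v in values if v.strip()]
    return {"count": len(nums), "min": min(nums),
            "max": max(nums), "mean": sum(nums) / len(nums)}

def choose_column(headers):
    """Show a numbered menu of column headers and return the user's pick."""
    for i, name in enumerate(headers, start=1):
        print(f"{i}. {name}")
    pick = int(input("Analyze which column? "))
    return headers[pick - 1]

def analyze(rows, headers, column):
    """Pull one named column out of the rows and summarize it."""
    idx = headers.index(column)
    return summarize([row[idx] for row in rows])
```

With `headers` and `rows` loaded from the spreadsheet, `choose_column(headers)` would show the menu and `analyze(rows, headers, choice)` would crunch whichever column the user picked.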

The last item means I have more work to do with loops and working with headers from my spreadsheet. To get this work done, I am tacking back to learning more Python programming, this time from a book:

Matthes, Eric. 2016. Python crash course: a hands-on, project-based introduction to programming. Find in a Library

This book is in two parts. Part 1 is a general overview of Python, clearly explained with lots of hands-on examples. Part 2 is devoted to using knowledge gained in Part 1 in three different programming projects. Fortunately for me, the second project is all about data visualization, which is my primary interest in learning Python to begin with.

I’ve only been working with this book for a few days and I’ve worked through the first three chapters. Actually chapters two and three, because chapter one was a step-by-step guide to setting up your own programming environment, which I already had. Chapters two and three were mostly review for me, but showed me a few new things about print() and working with lists. The hands-on examples gave me good practice. The author is really encouraging about getting you to play with your code – something missing from the edX course, useful as it was.

I’m looking forward to working on chapter four because it will cover using loops with lists. While I’ll ultimately be working with arrays and pandas data structures, I expect some of this material to be relevant.
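
For anyone curious, the loop-over-list pattern looks like this in Python (the visit counts here are invented):

```python
branch_visits = [1200, 950, 430, 780]  # made-up library visit counts

# A for loop touches each list element in turn
total = 0
for visits in branch_visits:
    total += visits

# A list comprehension is a compact loop that builds a new list
doubled = [v * 2 for v in branch_visits]

print(total)    # 3360
print(doubled)  # [2400, 1900, 860, 1560]
```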

While the other Part 2 projects look fun, I will likely only do the data visualization project at this time.

I’ll try to write more once I get into the data visualization project in this book. I just wanted to let you know that despite a pretty busy life and the fluid political situation in this country, I’m still working on lifelong learning and hope you’ll find some time as well.

Posted in lifelong learning

What Do I Know? (External Blog): Data Journalism In The Alternative Fact Era

Data Journalism seems to be a hot topic these days as large databases are increasingly becoming available. For me the issue is figuring out how to download them, clean them up, and then play with them to find interesting patterns. That’s what I’m hoping to get out of the class.

Source: What Do I Know?: Data Journalism In The Alternative Fact Era

Quick post from another Alaskan who is in a different course on data science, this one seemingly specific to journalism. His full post gives some examples of how data was used in his city’s government and some speculation about what data journalism might be able to accomplish.

As far as my own efforts, I need to put a blog post together. The short version is that I found a development environment, had a false start due to spaces in row names, and now I have working (if very simple) code. More later.

Posted in Uncategorized

Review: Intergalactic Empires ‹ Planetary Defense Command ‹ Reader — WordPress.com

Link: Review: Intergalactic Empires ‹ Planetary Defense Command ‹ Reader — WordPress.com

If your duties cover readers’ advisory, or you just like science fiction, Planetary Defense Command is a good blog to follow. The post I’m linking to above is a set of short reviews of vintage science fiction books involving galactic empires. The post also links to what I think is a good series on using a galactic empire as a story setting.

The blogger at Planetary Defense Command seems pretty knowledgeable about science fiction. Give him a try.

Posted in Book Reviews

Microsoft: DAT208x Introduction to Python for Data Science – COMPLETED

Today I completed the self-paced edX class Microsoft: DAT208x Introduction to Python for Data Science. My last push included Module 6, the intriguingly titled “Control flow and Pandas,” followed by an accidental completion of the final exam. More about that later. I finished the audit track of the course with 91%. I could have purchased a Verified Certificate from edX for $49.00, but I haven’t seen the edX certificate used much in the job market. I know what I learned and that’s enough for now.

Enough of housekeeping and celebration. Let’s talk about Module 6, “Control flow and Pandas.” Control flow was a presentation on Booleans (which most librarians really ought to know in their sleep) and if-then statements. The syntax for if-then is a bit different in Python than in other programming languages I’ve been exposed to, but still familiar.
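
For comparison, here’s what that syntax looks like – Python marks blocks with a colon and indentation rather than braces or a “then” keyword. The overdue-days scenario is invented:

```python
overdue_days = 12  # made-up example value

if overdue_days == 0:
    status = "on time"
elif overdue_days <= 7:
    status = "grace period"
else:
    status = "overdue"

print(status)  # overdue
```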

Pandas turns out to stand for Python Data Analysis. It is a set of tools that lets you work on arrays that have different data types. It lets you import files – either from your computer or by URL. Once imported, you can select specific rows and columns by name. It seemed very handy, though I need to read up on it some more and especially practice using it.
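
A minimal sketch of that workflow, with an in-memory string standing in for a file path or URL (pd.read_csv accepts any of the three; the branch data is made up):

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk or at a URL
csv_text = io.StringIO(
    "branch,visits,circulation\n"
    "Main,1200,2500\n"
    "East,430,900\n"
)
df = pd.read_csv(csv_text, index_col="branch")

# .loc selects rows and columns by name
print(df.loc["Main", "visits"])  # 1200
print(df["circulation"].sum())   # 3400
```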

After I completed Module 6, I thought I’d take a look at what the final exam looked like. I might be wrong, but when I entered the final exam I seemed to be put into a “this is your one chance to take the final – you have four hours total to complete the problems” page. I didn’t immediately see a graceful way out, so I basically took the final cold. It was a mix of question types, and you had four minutes to answer a given question before moving on to the next. A couple stumped me outright, but I was able to provide some kind of answer for all but a few, and I wound up with an 82% on the final. Not fantastic, but pretty good for taking it cold. Averaged with my quizzes and practice labs, I wound up with 91%, so I’m not looking for a retake.

While the course did try to sell you a $29/mo subscription to DataCamp at the end of Every. Single. Exercise, I still recommend this course to others. It’s a great way to get your feet wet both in Python and in the beginnings of data analysis.

My next step, which I hope to take tomorrow, is to download and install Python, NumPy, Matplotlib and Pandas on my laptop. I’ll also bookmark the documentation. Then I’ll have a look at a public dataset, probably library related, and see what I can do with it.

In the longer term, I should do some reading up on data science. I’ve had a number of reading recommendations from people, so it’s just a matter of picking something up and working through it.

Posted in lifelong learning

Microsoft: DAT208x Introduction to Python for Data Science: Module 5 Matplotlib

Today I completed Module 5 of the self-paced edX class Microsoft: DAT208x Introduction to Python for Data Science – Matplotlib.

As you might guess from the title of this module, it was all about plotting data. I learned how to do line plots, scatter plots and how to graph three variables at once. The material looked at population trends and the correlation between life expectancy and per capita GDP.

The lessons covered different types of plots, how to customize the axes, and some material on histograms, which were presented as a good early step in exploring your data. The commands were basic, but powerful. Again, Excel has similar functionality, but I can see where it might be easier to customize a plot in Python. To get a sense of the full range of plots that matplotlib can do, check out their gallery.
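
As a taste of what the module covered, here’s a minimal sketch with invented numbers – a line plot, then a scatter plot where marker size carries a third variable. (The Agg backend just renders to a file, no display needed.)

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen to files
import matplotlib.pyplot as plt

years = [2000, 2005, 2010, 2015]
pop = [6.1, 6.5, 6.9, 7.3]   # made-up population, billions
gdp = [30, 40, 55, 70]       # made-up per-capita GDP index
life = [66, 68, 70, 72]      # made-up life expectancy, years

# Line plot: one variable over time
plt.plot(years, pop)
plt.xlabel("Year")
plt.ylabel("Population (billions)")
plt.savefig("population.png")
plt.clf()

# Scatter plot: marker size encodes a third variable (population)
plt.scatter(gdp, life, s=[p * 20 for p in pop])
plt.xlabel("GDP per capita")
plt.ylabel("Life expectancy")
plt.savefig("gdp_life.png")
```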

One key piece I don’t yet have is how to access external files. I assume I can figure this out from the documentation if it isn’t presented as a lesson, but until I have that piece, it will be hard to do analysis on the public library stats and other data sets I might be interested in. But I’m intrigued to try.

I found this module to be easier than the previous two, probably because I’m decent at producing charts in Excel and am familiar with the basics of plotting.

I also had another reminder that I need more education in data in addition to programming tools. There was one exercise where we were instructed to plot one variable on a logarithmic scale to make the trend show up better. It did, but I’m not sure why or under what conditions you’d want to use a logarithmic scale.
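
My rough understanding is that a log scale helps when values span several orders of magnitude or grow multiplicatively, since exponential growth plots as a straight line. A tiny sketch with invented data:

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen to a file
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5, 6]
y = [10, 100, 1_000, 10_000, 100_000, 1_000_000]  # invented exponential data

plt.plot(x, y)
plt.yscale("log")  # each tenfold jump becomes an even step up the axis
plt.savefig("log_scale.png")
```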

The final module has an intriguing title: Control Flow and Pandas. I’m guessing it’s not about pandas munching sticks of data.

Posted in lifelong learning

Microsoft: DAT208x Introduction to Python for Data Science: Module 4 Numpy

Today I completed Module 4 of the self-paced edX class Microsoft: DAT208x Introduction to Python for Data Science – Numpy.

Just a quick moment of celebration: Yay! I’m two-thirds of the way through a programming class in a topic that might have made me run away screaming in college.

The material is getting harder and denser, as it probably should in a class that teaches a lot of new material.

Numpy is short for numeric Python. It seems to be pronounced “numb pie” instead of “numb P”, which makes me think of paraphrasing the Gumby theme song to “He can analyze any data set .. Num-py!”

But I digress. This is the module where I really started to see the power of Python and realize I may need to study some aspects of statistics more.

Numpy’s main strengths, in my view, are 1) the ability to work on entire tables of data at once with no need for loop code, 2) its built-in package of statistical functions, and 3) relatively easy subsetting of arrays.
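
All three strengths in miniature, with invented height data:

```python
import numpy as np

heights = np.array([71, 74, 68, 70, 73])  # invented heights, inches

# 1) Whole-array arithmetic: convert every value at once, no loop
heights_cm = heights * 2.54

# 2) Built-in statistics
mean = heights.mean()  # 71.2

# 3) Boolean subsetting: keep only the heights above the mean
tall = heights[heights > mean]

print(heights_cm)
print(tall)  # [74 73]
```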

The MS course also started to go into a few data analysis techniques apart from programming. Two examples:

  1. When you first get your data, it is very helpful to print the mean and median of each of the variables in your data. If the mean and median are far apart, and especially if the mean is an unrealistic value (say 2000 inches for human height), the discrepancy may point to a flaw in data gathering and/or retrieval.
  2. It offered some tips on testing a guess/hypothesis, working through an example of whether soccer goalkeepers were generally taller than other players. It also offered an example of checking whether there was a correlation between height and weight.
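
Both checks are easy to sketch in numpy. All the numbers below are invented; a real analysis would use real measurements:

```python
import numpy as np

# Invented heights with one data-entry error
heights = np.array([70, 72, 68, 71, 2000])

# A mean far from the median flags a possible data problem
print(np.mean(heights))    # 456.2 -- unrealistic for human height
print(np.median(heights))  # 71.0

# Invented height/weight pairs for the correlation check
heights_ok = np.array([66, 72, 64, 69, 71])
weights = np.array([150, 180, 140, 160, 175])

# Correlation coefficient: close to 1.0 means strongly related
print(np.corrcoef(heights_ok, weights)[0, 1])
```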

I also learned how to generate simulated data by passing parameters to a randomizing function.
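
For example (the mean, spread, sample size, and seed here are all arbitrary choices):

```python
import numpy as np

np.random.seed(42)  # fixed seed so the run is reproducible

# 1000 simulated heights drawn from a normal distribution:
# mean 70 inches, standard deviation 3 inches
sim_heights = np.random.normal(70, 3, 1000)

print(round(sim_heights.mean(), 1))  # close to 70
print(round(sim_heights.std(), 1))   # close to 3
```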

At this point, I think a number of things you can do with data in Python are similar to what can be done in Excel. But I get the sense that Python will handle much larger datasets than Excel can. It may also be easier to compactly report the results. An examination of the documentation at www.numpy.org may also yield functionality not available in Excel.

I haven’t yet established a home Python environment, but this lesson gave me an incentive to do so. I have a few datasets I’d like to play with, though at this point we haven’t covered importing data files into the Python environment.

Next module, likely done Sunday or Monday, will be on plotting data. Something I’m very much looking forward to.

Posted in lifelong learning