Books! January 2016 Edition

Books finished

  • The Philosopher Kings, by Jo Walton. A sequel to The Just City, in which Jo Walton explores a society dedicated to enacting Plato’s Republic. Very good read, and it explores a few interesting variants of the Republic idea, but it spends a bit more time than I’d like following the adventures of specific characters. (This is not normally a complaint I’d have for a novel, but I feel like this series is more about the ideas than the characters.) On the whole not as interesting or good as the first book, but still recommended.

  • Pride and Prejudice, by Jane Austen. I actually avoided being forced to read any Austen in grade school, but I was recently recommended Mary Robinette Kowal’s Shades of Milk and Honey, and reading at least one Austen book first was suggested. I have yet to read the Kowal, but I’m glad I waited until now to read Austen. I would not have appreciated this book in the least as a teenage boy; reading it now was extremely enjoyable. It helps a lot that Leigh is a huge Austen fan, and was able to help me understand some of the cultural context that informs the story. I was also surprised to find the book to be much funnier, and more absurd, than most of the media adaptations.

  • The Checklist Manifesto, by Atul Gawande. This was an excellent read for a book with such a simple message: checklists (or other simple, task-oriented memory aids) make it easier to deal with complexity. Their use appears to improve outcomes across the board. The book covers a variety of fields, citing both peer-reviewed research and anecdotes to make its case, and the author tells the story of introducing checklists into regular practice in surgery.

  • Star Wars: Scoundrels, by Timothy Zahn. Yes, it’s a Star Wars book; sometimes you just need something light and silly to read. However, despite normally being a fan of Zahn’s work, I found this one fairly disappointing. A heist novel starring Han Solo and Lando Calrissian should be breezy and fun, but this honestly dragged on a bit. It was still fun, but I had to work at it.

In progress

Want to talk about any of these books, or to send me a recommendation? Friend me on Goodreads or send me an email. I read a lot, and I love talking about books.

A Few Goals for 2016

I’m not sure I’d like to call them “resolutions”, but here are a few of the things I’m planning to work on in 2016.

  • Improve my skills with Python and Go

I’ve been using Python and Go for most of my projects lately, whether at work or for fun, and I usually feel like I can figure out how to solve my problems in them. But in both languages, I still spend a lot of time with API docs open in the next window, and I haven’t written anything longer than a few thousand lines of code in either. I’d like to try to tackle a couple of larger projects and get really solid with the standard libraries, so I can spend more time coding and less time looking things up.

  • Get a better understanding of theory

There are several areas where I have a solid “practitioners’ understanding” of what I’m doing, but not a lot of underlying theory. This includes some fields I at least touch on every day as a system administrator, such as monitoring and configuration management, as well as some basic computer science I didn’t get in school since I wasn’t a CS major.

Having come from physics, where I learned a lot more theory before attempting to put anything into practice, this feels pretty backwards. I know that these days, computing is a field where it’s pretty normal to be self-taught and just start working; and large systems administration in particular seems like a field where there is not a well-defined body of theory that everyone reads. Still, I’d like to get myself on a more rigorous footing, especially given the scale and complexity of the types of systems I work on, and those I’d like to be working on.

I have a Goodreads shelf where I’m collecting some of the books in this vein that I’ve read, and will probably re-read this year. It includes some books that aren’t strictly theory, but more like collections of best practices; but these are useful too. Sometime later I may post a list of papers I’m planning to read or re-read this year. Suggestions in this area are welcome. :)

  • Refresh my math

Having been out of physics and materials science for over five years now, I guess it’s normal for me to feel a little rusty on some things. But while I don’t intend to be getting back into active physics research, I’d still like to maintain the skills I spent so much time learning, and still be able to use and understand advanced math when called for. Not to mention refreshing myself on how to “think like a physicist”, which definitely comes in handy from time to time.

In particular, I know that my statistics and linear algebra are rusty, and those are areas that are useful in any technical field, so I’ll probably focus on these. I’ll also try to spend some time refreshing my calculus abilities; while I think I still have a pretty decent grasp in that area, I’m not sure I’d enjoy it if you asked me to integrate by parts…

  • Read more fiction by diverse authors

I’ve seen calls in the past to read only women, or only people of color, for a given year, but I’ve always thought it would be too difficult to restrict my reading habits that closely. I tend to pick up books impulsively, and don’t always notice who wrote a given novel; I don’t love the idea of deciding to not read something exciting because of who wrote it. Not to mention I have several ongoing series I enjoy which are written by straight, white men. And while I realize that part of the challenge is in its difficulty, fiction reading is escapism for me. Reading more books I can do; giving up others is a little too far.

However: Max Gladstone, whose books I enjoy myself, posted about an intermediate project which I am willing to try. While he had different reasons, he eventually came up with:

In the end, I settled on a related project: I wouldn’t read two books by straight white cis men back to back.

I like this because it should increase the diversity of my reading – always a good thing! – while not killing my ability to follow series I enjoy. And who knows, maybe next year I can go all in.

  • Make time for fun

I know in advance that 2016 is going to be very busy at work. Much more than this year, I expect to be spending some nights and weekends at the lab. Knowing that, I’m going to consciously try to make time for doing things completely unrelated to computers: playing board games, going curling, tasting beer, playing with our cats, and spending time with Leigh.

Of course, we’ll see how it goes. I wouldn’t be surprised if in 2016 I end up doing completely different things. But these are a few of the things I’m planning to think about right now.

Good luck in 2016, everyone. And Happy New Year!

Tooltip: ClusterShell

Whenever sysadmins get together, I’ve noticed we often end up trading interesting tools and tricks that others may not have heard of. There is just so much out there that you can’t know about every available tool, so it’s always fun to hear what others are using. In that spirit, I thought I’d start the occasional “tooltip” blog entry about tools I find useful at work or in playing around. (The name for the series being borrowed both from the little hover text, and from the similar segment on The Ship Show.)

ClusterShell

When you’re helping to manage a large computing system, it’s sometimes unavoidable that you will have to run some ad-hoc command on a large number of nodes. In theory, perhaps, your configuration management software or cluster manager should be the only tool used to manage most systems; in reality, these systems occasionally break, or become too cumbersome to make quick, transient actions. Sometimes you just need to give your system a kick, and the only tool you have left is ssh.

ClusterShell, from the HPC Group at CEA, provides a collection of tools and Python libraries for executing commands across a large number of machines at the same time and gathering the results. You can think of it as a parallel ssh command, and it’s very lightweight in both usability and performance. It works a lot like LLNL’s pdsh, which provides similar functionality, but ClusterShell also adds a lot of nice bells and whistles which I think make it a better tool for everyday system admin work.
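To make the "parallel ssh" idea concrete, here is a minimal sketch of the general technique using only the Python standard library: run one command on many hosts at once, then group the hosts that returned identical output. This is not ClusterShell's implementation or API; the function names (`ssh_run`, `run_everywhere`), the `fanout` parameter, and the ssh options are my own assumptions for illustration.

```python
import subprocess
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=30):
    """Run `command` on `host` over ssh and return its combined output.

    Hypothetical helper: assumes key-based ssh access to `host`.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()


def run_everywhere(hosts, command, runner=ssh_run, fanout=16):
    """Run `command` on every host in parallel.

    Returns a dict mapping each distinct output to the list of hosts
    that produced it, so identical results are reported only once.
    """
    hosts = list(hosts)
    grouped = defaultdict(list)
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # pool.map preserves input order, so outputs line up with hosts.
        outputs = pool.map(lambda host: runner(host, command), hosts)
        for host, output in zip(hosts, outputs):
            grouped[output].append(host)
    return dict(grouped)
```

Calling `run_everywhere(["node1", "node2", "node3"], "uname -r")` (the host names are placeholders) would return something like one entry per distinct kernel version, each with the list of nodes running it. The sketch does no error handling, so a single timeout raises out of the whole call; a real tool has to cope with that gracefully.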


Recommended Links for 2015-11-28

Here are some of the interesting things I’ve read recently on the Web.

Recommended Links for 2015-11-06

It’s been forever since I updated the blog at all, and I do it with a linkdump. :) But it’s also a good chance to roll out a new theme, as I was getting sick of the dark one.

Here are some of the interesting things I’ve been reading lately.

  • Fail at Scale by Ben Maurer at Facebook

    I found this article on failures in distributed systems at Facebook really interesting. It includes information on common observed failure modes, advice for preventing failures from cascading across a large system, improvements in monitoring dashboards, and a description of Facebook’s incident review methodology. There are a lot of really good practical lessons here, and I highly recommend reading this article.

    One small section that stood out to me called out the fact that human-initiated changes are a major source of failure, an insight I’ve seen in a number of places lately:

    These two data points seem to suggest that when Facebook employees are not actively making changes to infrastructure because they are busy with other things (weekends, holidays, or even performance reviews), the site experiences higher levels of reliability. We believe this is not a result of carelessness on the part of people making changes but rather evidence that our infrastructure is largely self-healing in the face of non-human causes of errors such as machine failure.

  • Let’s talk about logging, by Dave Cheney

    I’m actually sharing this link for two reasons: because it’s thought-provoking, and because I disagree with it so much! Cheney makes some interesting arguments for the idea that there are only two types of logs: debug logs that programmers care about, and info logs that users care about. Other log levels are unneeded.

    My disagreement could probably be expanded into a whole blog post. But suffice it to say that I think having levels of logging is extremely useful from an operational perspective and makes monitoring and filtering those logs a lot easier. In my opinion, collapsing these into “debug” and “info” would only result in a lot of custom tags in the text of the logs, and a lot more work for regex parsers in the monitoring system. :)

  • A Case Study - Scaling Legacy Code on Next Generation Platforms, by William Roshan Quadros at SNL for IMR24

    This paper is an interesting little case study about how to scale HPC codes to platforms with new processor technologies and a higher level of parallelism – i.e., the new Trinity system at LANL. I expect to have reason to re-read this a few more times.

  • Disaggregated disk, by Dan Luu

    This is a good read about how the difference in latency between different types of data transfers (disk, network, RDMA, etc.) helps to determine what kind of system design will perform best. For example, storage can often be located on a different machine rather than locally because network latency is often much less than the latency of a disk seek. Very much worth the read.

  • Complexity budgets, by Jamie Brandon

    A good short post on designing systems to reduce complexity and maximize the ability of a team to understand the system, not just optimizing each individual part of the system in isolation.

  • Blue Monday, by Laurie Penny

    An interesting and disturbing little near-future science fiction story, on Vice’s Motherboard site. (That site has been impressing me a lot lately.)
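Coming back to the logging item above: here is a small sketch of why I find log levels useful operationally. It uses Python's standard `logging` module; the logger name and the messages are made up for illustration. The application logs at every level, while the handler that stands in for the monitoring feed only passes WARNING and above, with no regex parsing of the message text required.

```python
import io
import logging

# In-memory stream standing in for whatever feeds the monitoring system.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.WARNING)  # the level does the filtering
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("storage")  # hypothetical component name
logger.setLevel(logging.DEBUG)         # the application emits everything
logger.addHandler(handler)
logger.propagate = False               # keep the root logger out of it

logger.debug("cache miss for block 42")
logger.info("mounted /scratch")
logger.warning("disk 3 nearing capacity")
logger.error("disk 3 failed")

# Only the WARNING and ERROR records reach the monitoring stream.
alerts = stream.getvalue().splitlines()
```

With only "debug" and "info" available, the same split would have to be encoded in the message text and recovered by pattern matching downstream.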