## Creating Your First PDF with LaTeX and Atom

This tutorial will walk you through the steps of creating your first PDF with LaTeX and Atom. This guide focuses on installing LaTeX and Atom on a Mac, but since Atom is a cross-platform editor, most of the instructions should work on Windows and Linux as well. You will need about an hour to download everything and to produce your first PDF.

## Install MacTeX

Download MacTeX. MacTeX installs everything you need to compile .tex files into PDFs. The download will take a while, so grab a coffee.

## Install Atom

If you haven’t already, download the Atom text editor. Atom is awesome because it is open source and supported by GitHub.

On my MacBook Pro running Yosemite, I clicked the “Download For Mac” button, then opened the downloaded atom-mac.zip. In Finder, just drag “Atom” to your Applications folder. You can then find Atom in your Applications folder or launch it from Spotlight. The first time you open Atom, press the “Open” button to trust Atom if prompted.

## Install Skim (for previewing PDFs)

LatexTools makes use of Skim for previewing works-in-progress. Download and install Skim. On OS X Yosemite, I installed version 1.4.17.

To make Skim trusted so that the preview will work, open Skim by holding down the control key while clicking on the Skim icon in the “Applications” folder in Finder. Click “Open” at the prompt.

## Install LatexTools

Open the “Settings” tab by pressing Command+, (Command and the comma key) or using the menu “Atom > Preferences…”.

Click on the “Install” tab on the left. Type in language-latex and click the “Install” button in the language-latex package box. I installed version 0.6.1. This package provides syntax highlighting that will make working with TeX much more enjoyable.

Next, type in latextools and install the latextools package.

## Create a tex source file

Create a new file if you don’t already have one up (you should see a tab titled “untitled” if you already have a new file open). To create a new file go to “File > New File” in the menu or use the keyboard shortcut Command+N.

In the new file paste the following TeX sample:

\documentclass{article}
\title{Title}
\begin{document}
\maketitle{}
\section{Introduction}
This is where you will write your content.
\end{document}


Save this file as sample.tex. The syntax highlighter should now recognize the content (see all the pretty colors?).

## Build and view your PDF

To build this PDF, use the following keyboard shortcut: Command+Alt+B (i.e., all three of those keys at the same time). If that doesn’t work, check your keybindings in the “Settings” tab, under “Keybindings” on the left. Type in latextools:build to see what the keybinding is on your system. On a Mac (i.e., “Darwin”) the keybinding should read alt-cmd-b; for Windows and Linux the default is probably ctrl-alt-b.

## Conclusion

Hopefully now you have your first PDF ready to show off to all your neighbors. If not, let me know in the comments below so I can update the tutorial.

## R for Impact Evaluation: R and Stata Side-by-side

This tutorial follows the Handbook on Impact Evaluation: Quantitative Methods and Practices, chapter 11. The data files we will use can be downloaded from here. The first part of Chapter 11 is covered in Impact Evaluation on a Budget: World Bank Data and R.

# Notes on Commands

• Stata commands are typed in lowercase; R commands are functions and are called with parentheses (e.g., ls())
• In Stata, you can type abbreviated forms of commands and variables provided there is no ambiguity. In R, you must use the full function or variable name.
• In Stata, use the Page-Up and Page-Down keys to cycle through previously entered commands. In R, use the Up and Down Arrow keys to do this.

# Working with Data Files: Looking at the Content

## Open the Dataset

Here I assume you saved the file (from the previous tutorial) to the ~/eval/data folder.

Stata:

use ~/eval/data/hh_98.dta

R:

library(foreign)
hh_98 = read.dta('~/eval/data/hh_98.dta')

(If you don’t already have the foreign library installed, you can use the command install.packages("foreign").)

## Listing the Variables

Stata:

describe

R:

ls(hh_98)
dim(hh_98)
sapply(hh_98,class)

The function ls(x) displays the names of the objects within x. If you just enter ls(), R will show you the names of the objects open in your current environment (remember you can use ?ls to see the R documentation for the ls() function). The function dim(x) returns the dimensions of object x. When measuring a data.frame, like hh_98, dim() returns the number of rows first followed by the number of columns. The function sapply(x,FUN) returns a simplified result from applying the function FUN to each object in x. The function class(x) returns the class of object x.

## Wildcards and Abbreviations

Stata:

describe exp*

R:

summary(hh_98[grep("exp", colnames(hh_98))])

In R, it is possible to do things even if we don’t know the exact name of the object we want to analyze. Starting from the innermost function and working our way out, colnames(hh_98) returns a vector where each element is the name of a column of hh_98. grep("exp", x) returns the indices of the elements that contain “exp” (you can also use regexp here) within x. Placing the resulting vector of indices into hh_98[] returns the matching columns. Finally, summary() returns the following summary of the returned columns:

         expfd             expnfd             exptot
     Min.   :  945.3   Min.   :   89.55   Min.   : 1193
     1st Qu.: 2602.1   1st Qu.:  514.37   1st Qu.: 3254
     Median : 3373.7   Median :  865.31   Median : 4432
     Mean   : 3660.2   Mean   : 1813.08   Mean   : 5473
     3rd Qu.: 4232.5   3rd Qu.: 1710.24   3rd Qu.: 6039
     Max.   :15270.7   Max.   :43411.15   Max.   :47981

## Listing Data

List the first three entries in hh_98:

Stata:

list in 1/3

R:

hh_98[1:3,]

In R, you can access records in a data.frame using matrix notation. The colon (:) separates the beginning and ending of a sequence. By leaving the portion following the comma blank, we tell R to show all columns.

List household size and head’s education for households headed by a female who is younger than 45:

Stata:

list famsize educhead if (sexhead==0 & agehead<45)

R:

subset(hh_98,sexhead==0 & agehead<45,c(famsize,educhead))

The subset() function is another method of selecting elements. Here’s the matrix form of the same subset:

R:

hh_98[hh_98$sexhead==0 & hh_98$agehead<45,c("famsize","educhead")]

Browse or edit the data:

Stata:

browse
edit

R:

View(hh_98)
edit(hh_98)

## Summarizing Data

Display summary statistics for a few variables:

Stata:

sum famsize educhead
sum famsize educhead, d

R:

summary(hh_98[,c("famsize","educhead")])
library(psych)
describe(hh_98[,c("famsize","educhead")])

(If you don’t already have the psych library installed, you can use the command install.packages("psych").)

Using survey weights:

Stata:

sum famsize educhead [aw=weight]

R:

library(survey)
design <- svydesign(id=~nh,weights=~weight,data=hh_98)
svymean(~famsize + educhead,design)

(If you don’t already have the survey library installed, you can use the command install.packages("survey").)

Summarize by groups:

Stata:

sort dfmfd
by dfmfd: sum famsize educhead [aw=weight]
tabstat famsize educhead, statistics(mean sd) by(dfmfd)

R:

library(survey)
svyby(~famsize + educhead, ~dfmfd, design, svymean)

(You only need to call library(survey) once per session.)

## Frequency Distributions (Tabulations)

Stata:

tab dfmfd 

R:

table(hh_98$dfmfd)

In R, the table() function presents a table similar to the tabulate command in Stata, but only shows the counts grouped by factor. To see both the counts and percentages, as in the Stata output, we can divide by the total count (i.e., the length()); this gives proportions, which you can multiply by 100 for percentages. I group the counts and percentages using a list() so they are displayed together.

R:

list(count=table(hh_98$dfmfd),percent=table(hh_98$dfmfd)/length(hh_98$dfmfd))

Frequency tables over subsets and for multiple variables:

Stata:

tab sexhead if dfmfd==1
tab educhead sexhead

R:

table(hh_98[hh_98$dfmfd==1,]$sexhead)
table(hh_98$educhead, hh_98$sexhead)

Column and row percentages:

Stata:

tab dfmfd sexhead, col row

R:

mytable <- table(hh_98$dfmfd, hh_98$sexhead)
list(counts = mytable,
     percent.row = prop.table(mytable,1),
     percent.col = prop.table(mytable,2),
     count.row = margin.table(mytable,1),
     count.col = margin.table(mytable,2))

## Distributions of Table Statistics

Stata:

table dfmfd, c(mean famsize mean educhead)

R:

-or-

## The Economist Illustrated: Kazakhstan

Illustrated by: Joel Hopler

### The Inspiration

Kazakhstan’s capital: Laying the golden egg

### Illustrator’s Notes

I initially thought it would be ironic to show a golden egg of happiness being held up by a beautiful piece of architecture to symbolize how the president of Kazakhstan is hoarding the people’s happiness. Then I re-read the part of the article that mentions the egg and realized that this imagery literally exists in the Bayterek tower. I chose to create an image that more explicitly shows a powerful fist holding the egg away from the tent of nomads.

## More Google Charts with a CSV: Bubble Charts

Last time we built an interactive scatter plot. This time we’re going to turn that scatter plot into a bubble chart (see a preview of the finished product here). Start by opening up the HTML document we created last time. You can see the source here or below:

    <!DOCTYPE html>
    <html>
    <head>
      <title>Google Chart Example</title>
      <script src="https://www.google.com/jsapi"></script>
      <script src="http://code.jquery.com/jquery-1.10.1.min.js"></script>
      <script src="jquery.csv-0.71.js"></script>
      <script>
        // load the visualization library from Google and set a listener
        google.load("visualization", "1", {packages:["corechart"]});
        google.setOnLoadCallback(drawChart);
        function drawChart() {
          // grab the CSV
          $.get("kzn1993.csv", function(csvString) {
            // transform the CSV string into a 2-dimensional array
            var arrayData = $.csv.toArrays(csvString, {onParseValue: $.csv.hooks.castToScalar});
            // use arrayData to load the select elements with the appropriate options
            for (var i = 0; i < arrayData[0].length; i++) {
              // this adds the given option to both select elements
              $("select").append("<option value='" + i + "'>" + arrayData[0][i] + "</option>");
            }
            // set the default selection
            $("#domain option[value='0']").attr("selected","selected");
            $("#range option[value='1']").attr("selected","selected");
            // this new DataTable object holds all the data
            var data = new google.visualization.arrayToDataTable(arrayData);
            // this view can select a subset of the data at a time
            var view = new google.visualization.DataView(data);
            view.setColumns([0,1]);
            var options = {
              title: "KwaZulu-Natal Household Survey (1993)",
              hAxis: {title: data.getColumnLabel(0), minValue: data.getColumnRange(0).min, maxValue: data.getColumnRange(0).max},
              vAxis: {title: data.getColumnLabel(1), minValue: data.getColumnRange(1).min, maxValue: data.getColumnRange(1).max},
              legend: 'none'
            };
            var chart = new google.visualization.ScatterChart(document.getElementById('chart'));
            chart.draw(view, options);
            // set listener for changes to the select elements
            $("select").change(function(){
              // determine selected domain and range
              var domain = +$("#domain option:selected").val();
              var range = +$("#range option:selected").val();
              // update the view
              view.setColumns([domain,range]);
              // update the options
              options.hAxis.title = data.getColumnLabel(domain);
              options.hAxis.minValue = data.getColumnRange(domain).min;
              options.hAxis.maxValue = data.getColumnRange(domain).max;
              options.vAxis.title = data.getColumnLabel(range);
              options.vAxis.minValue = data.getColumnRange(range).min;
              options.vAxis.maxValue = data.getColumnRange(range).max;
              // update the chart
              chart.draw(view, options);
            });
          });
        }
      </script>
    </head>
    <body>
      <div id="chart" style="width:800px; height:500px;">
      </div>
      <select id="range"></select>
      <select id="domain"></select>
    </body>
    </html>

### Add Controls for Size and Color

Bubble charts add two dimensions, size and color, to the standard scatter plot (here I’m using Google’s terminology; several other graphics libraries simply add this functionality to their scatter plot functions). To keep the nice interactivity we built into our last chart, let’s start by adding controls for the color and size. We’ll nest everything in an unordered list and add labels to the controls. Just change the section with two <select> tags to match the following:

    <ul>
      <li> Y-Axis <select id="range"></select> </li>
      <li> X-Axis <select id="domain"></select> </li>
      <li> Color <select id="color"></select> </li>
      <li> Size <select id="size"></select> </li>
    </ul>

Next we want to get rid of the bullets in our unordered list. Add the following <style> tag inside your <head> tag.
    <style>
      ul { list-style-type: none; }
    </style>

### Changing the Chart Type

Change the line that loads the chart object from this:

    var chart = new google.visualization.ScatterChart(document.getElementById('chart'));

to this:

    var chart = new google.visualization.BubbleChart(document.getElementById('chart'));

### Feeding the Data to the Chart

The data table for Google’s bubble chart requires the first column to be a string that can be used to identify the bubbles. When we loaded the CSV into an array in the last tutorial, we parsed all values as scalars. We need to update our DataView call to change the values in the first column, the household ids (hhid), to strings. This requires us to add a function that retrieves these strings from the DataTable.

    var view = new google.visualization.DataView(data);
    view.setColumns([{calc:stringID, type: "string"},1,2,3]);
    // this function returns the first column values as strings (by row)
    function stringID(dataTable, rowNum){
      return dataTable.getValue(rowNum, 0).toString();
    }

### Updating the Chart

Now we need to modify the code that updates the chart when a user changes the selected variables. First we’ll add local variables for color and size to the <select> listener function. These variables need to be assigned the value of the respective <select> tag. After we have column indices for color and size, we will set these as the third and fourth columns (after the id column) in our bubble chart view.
The new and changed lines are the color and size assignments and the setColumns call:

    $("select").change(function(){
      // determine selected domain and range
      var domain = +$("#domain option:selected").val();
      var range = +$("#range option:selected").val();
      var color = +$("#color option:selected").val();
      var size = +$("#size option:selected").val();
      // update the view
      view.setColumns([{calc:stringID, type: "string"},domain,range,color,size]);
      // update the options
      options.hAxis.title = data.getColumnLabel(domain);
      options.hAxis.minValue = data.getColumnRange(domain).min;
      options.hAxis.maxValue = data.getColumnRange(domain).max;
      options.vAxis.title = data.getColumnLabel(range);
      options.vAxis.minValue = data.getColumnRange(range).min;
      options.vAxis.maxValue = data.getColumnRange(range).max;
      // update the chart
      chart.draw(view, options);
    });

Unfortunately, when I test this and select a few variables of interest, the resulting chart is not very useful: the id values obscure all the information.

### Improving Upon the Defaults

#### Removing the Bubble Label

The bubble labels would work fine if we had only a few data points and being able to quickly identify them was important. In this case, we are more interested in the general relationships between the variables and not the specific position of any one household. Let’s start by removing the bubble label. Go to our stringID function and return an empty string instead of the household id (be sure to comment out the old return statement):

    function stringID(dataTable, rowNum){
      // return dataTable.getValue(rowNum, 0).toString();
      // return an empty string instead to avoid the bubble labels
      return "";
    }

#### Removing the Bubble Border and Adjusting Bubble Opacity

Okay. This is a lot nicer, but we can do better by removing the bubble borders and lowering the bubble opacity, since both cause issues with occlusion (i.e., there is data we are not seeing due to overly opaque data in the foreground).
To remove the bubble’s border we’ll set its stroke color to “transparent”. Let’s change the opacity from the default of 0.8 to 0.2. To implement this we need to add an element to our initial options object

    var options = {
      title: "KwaZulu-Natal Household Survey (1993)",
      hAxis: {title: data.getColumnLabel(0), minValue: data.getColumnRange(0).min, maxValue: data.getColumnRange(0).max},
      vAxis: {title: data.getColumnLabel(1), minValue: data.getColumnRange(1).min, maxValue: data.getColumnRange(1).max},
      bubble: {stroke: "transparent", opacity: 0.2},
    };

and reset it in our <select> listener function:

    // update the options
    options.hAxis.title = data.getColumnLabel(domain);
    options.hAxis.minValue = data.getColumnRange(domain).min;
    options.hAxis.maxValue = data.getColumnRange(domain).max;
    options.vAxis.title = data.getColumnLabel(range);
    options.vAxis.minValue = data.getColumnRange(range).min;
    options.vAxis.maxValue = data.getColumnRange(range).max;
    options.bubble = {stroke: "transparent", opacity: 0.2};

#### Changing the Color Gradient

This is starting to look great. One issue I have with the default color choice, besides being ugly, is that gray with an opacity of 0.2 is hard to see. Let’s make the color gradient change from red to blue.
We do this again by adding an element to the initial options object

    var options = {
      title: "KwaZulu-Natal Household Survey (1993)",
      hAxis: {title: data.getColumnLabel(0), minValue: data.getColumnRange(0).min, maxValue: data.getColumnRange(0).max},
      vAxis: {title: data.getColumnLabel(1), minValue: data.getColumnRange(1).min, maxValue: data.getColumnRange(1).max},
      bubble: {stroke: "transparent", opacity: 0.2},
      colorAxis: {colors:['red','blue']},
    };

and resetting it in our <select> listener function:

    // update the options
    options.hAxis.title = data.getColumnLabel(domain);
    options.hAxis.minValue = data.getColumnRange(domain).min;
    options.hAxis.maxValue = data.getColumnRange(domain).max;
    options.vAxis.title = data.getColumnLabel(range);
    options.vAxis.minValue = data.getColumnRange(range).min;
    options.vAxis.maxValue = data.getColumnRange(range).max;
    options.bubble = {stroke: "transparent", opacity: 0.2};
    options.colorAxis = {colors:['red','blue']};

Bam! That’s our finished product. These changes have made it easier to explore the dataset and added a little style in the process. You can find an interactive version here (check the source if you are having problems with your chart).

### Conclusion

While this ramped up the complexity of our figure (compared to the chart from the previous tutorial), being able to change which variables control the color and size of the bubbles will make your data that much more engaging. Take the source, change the reference to your CSV, and remember to download a copy of the jquery-csv script. With just a few steps you can have your own interactive chart to encourage your site’s viewers to explore your data.

Check out the next tutorial in this series: Google Charts and CSV Part 3: Side-by-Side Bubble Charts

For more information on Google’s bubble chart, check the documentation here.
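The listener above repeats the same axis assignments every time a selection changes. As a rough sketch, they could be factored into small helpers. The names axisFor and applySelection and the stand-in fakeData object are hypothetical; the only DataTable methods assumed are getColumnLabel() and getColumnRange(), which the tutorial code already calls.

```javascript
// Hypothetical helper: build the axis settings for one column.
// `data` only needs getColumnLabel(col) and getColumnRange(col).
function axisFor(data, col) {
  var range = data.getColumnRange(col);
  return { title: data.getColumnLabel(col), minValue: range.min, maxValue: range.max };
}

// Hypothetical helper: replace both axes in place; other settings are not touched.
function applySelection(options, data, domain, range) {
  options.hAxis = axisFor(data, domain);
  options.vAxis = axisFor(data, range);
  return options;
}

// Stand-in for a DataTable (made-up labels and ranges) so the sketch runs outside the browser
var fakeData = {
  labels: ["hhid", "hhsize", "income"],
  ranges: [{ min: 1, max: 3 }, { min: 2, max: 6 }, { min: 120, max: 410 }],
  getColumnLabel: function (col) { return this.labels[col]; },
  getColumnRange: function (col) { return this.ranges[col]; }
};

var options = {
  title: "KwaZulu-Natal Household Survey (1993)",
  bubble: { stroke: "transparent", opacity: 0.2 },
  colorAxis: { colors: ["red", "blue"] }
};

// x-axis from column 2, y-axis from column 1
applySelection(options, fakeData, 2, 1);
console.log(options.hAxis.title, options.vAxis.maxValue);
```

Because the helper only replaces hAxis and vAxis, the bubble and colorAxis entries set when the options object was created are left as they were.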

## The Economist Illustrated: China

Illustrated by: Joel Hopler

### The Inspiration

Returning students: Plight of the sea turtles

### Illustrator’s Notes

The article made it clear that the sea turtle concept is no longer working in its intended way, so I thought a skeleton of a turtle would illustrate that well. I pointed the turtle westward and labeled it with its old and new names, “hai gui” and “hai dai”.

## Easy Data Visualization with Google Charts and a CSV

Static figures work fine for a print publication. However, when you want to present your research or collected data online, static is stale and dynamic is alive. Today we’re going to take a CSV and create a simple, but interactive, scatter plot.

This tutorial assumes some basic familiarity with HTML and JavaScript. If you don’t currently possess these skills, head on over to Codecademy and follow the Web Fundamentals track and the JavaScript track.

### Setting Up

To begin, we need to make sure we have the CSV we want to load and the JavaScript library jquery-csv in the same folder as our HTML.

#### Preview and Data

Here’s the end result of this tutorial: Finished Chart

The data I’ll be using is from the three-wave KwaZulu-Natal Income Dynamics Study (KIDS). In this example I will be using the first round of the survey (1993). Children are household members listed as younger than 16, and pensioners are defined as males over 65 and females over 60. I use an adult equivalent measure of household income used by Carter and May (1999) and many others in the South African context.

The cleaned CSV can be downloaded here. I recommend you download this CSV to work along with this tutorial, but feel free to use your own (just be careful to make the relevant changes to the example code). Add the CSV to the same folder as the HTML we will be creating.

#### jQuery-CSV

The jQuery-CSV library allows us to easily take a string of CSV data and transform it into the appropriate format for Google’s visualization library. Download either jquery.csv-0.71.js or jquery.csv-0.71.min.js from that page and add it to the folder where your HTML will go.
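To make the transformation concrete, here is a minimal plain-JavaScript sketch of turning a CSV string into an array of rows with numeric fields converted to numbers. It is not jquery-csv’s implementation: the function name csvToArrays and the sample string are made up for illustration, and the sketch ignores quoted fields and embedded commas, which a real CSV parser has to handle.

```javascript
// Hypothetical helper (not part of jquery-csv): split a simple CSV string
// into an array of rows, converting numeric-looking fields to numbers.
// Assumes no quoted fields, no embedded commas, and no embedded newlines.
function csvToArrays(csvString) {
  return csvString
    .split(/\r?\n/)                                           // one entry per line
    .filter(function (line) { return line.trim() !== ""; })   // drop blank lines
    .map(function (line) {
      return line.split(",").map(function (field) {
        var trimmed = field.trim();
        var asNumber = Number(trimmed);
        // keep the text when the field is empty or not numeric
        return trimmed !== "" && !isNaN(asNumber) ? asNumber : trimmed;
      });
    });
}

// Made-up sample: a header row followed by two data rows
var sample = "hhid,hhsize,income\n1,4,250.5\n2,6,410\n";
var rows = csvToArrays(sample);
console.log(rows);
```

The first row holds the column headers and the remaining rows hold the values, which is the layout the rest of the tutorial relies on when it reads arrayData[0] for the column names.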
### Accessing the CSV

To begin with, create the HTML document, load the Google JS API, jQuery, and the jQuery-CSV library, and display the contents of the CSV to confirm the CSV is where it’s supposed to be and that we can access all the JavaScript we need:

    <!DOCTYPE html>
    <html>
    <head>
      <title>Google Chart Example</title>
      <script src="https://www.google.com/jsapi"></script>
      <script src="http://code.jquery.com/jquery-1.10.1.min.js"></script>
      <script src="jquery.csv-0.71.js"></script>
      <script>
        // wait till the DOM is loaded
        $(function() {
          // grab the CSV
          $.get("kzn1993.csv", function(csvString) {
            // display the contents of the CSV
            $("#chart").html(csvString);
          });
        });
      </script>
    </head>
    <body>
      <div id="chart">
      </div>
    </body>
    </html>

Load your newly created HTML to confirm your code outputs the contents of the CSV.

### A Simple Scatter Plot

Clear the script tag we used to display the CSV; in this section we will focus on the JavaScript necessary to create a scatter plot with our CSV. Start by loading the visualization library and setting a callback function:

    // load the visualization library from Google and set a listener
    google.load("visualization", "1", {packages:["corechart"]});
    google.setOnLoadCallback(drawChart);

Next, we need to create the callback function we referenced in the previous step.
We’ll begin by grabbing the CSV as we did previously:

    function drawChart() {
      // grab the CSV
      $.get("kzn1993.csv", function(csvString) {

We need to transform the CSV into a format suitable for Google’s visualization library:

    // transform the CSV string into a 2-dimensional array
    var arrayData = $.csv.toArrays(csvString, {onParseValue: $.csv.hooks.castToScalar});

Next, we’ll transform this array into a DataTable object:

    // this new DataTable object holds all the data
    var data = new google.visualization.arrayToDataTable(arrayData);

Since we have more columns of data than are needed for our visualization, let’s create a view on this table of just the first two columns:

    // this view can select a subset of the data at a time
    var view = new google.visualization.DataView(data);
    view.setColumns([0,1]);

Now let’s set some basic options for our chart:

    var options = {
      title: "KwaZulu-Natal Household Survey (1993)",
      hAxis: {title: data.getColumnLabel(0), minValue: data.getColumnRange(0).min, maxValue: data.getColumnRange(0).max},
      vAxis: {title: data.getColumnLabel(1), minValue: data.getColumnRange(1).min, maxValue: data.getColumnRange(1).max},
      legend: 'none'
    };

Now we need to bind a chart to our <div> and tell the chart to draw the current view with the options we selected:

    var chart = new google.visualization.ScatterChart(document.getElementById('chart'));
    chart.draw(view, options);

All that’s left for this stage is to close our function blocks:

      });
    }

If you load our current progress you should see a (relatively meaningless) chart.

### Adding Interaction

This chart already features interactivity in the form of rollover states for the plotted points. What we really need is to be able to change the variables we are plotting on the fly. Add the following tags after the </div> tag.
I place the range first so that it lines up with the y-axis title:

    <select id="range"> </select>
    <select id="domain"> </select>
    <button type="button">Update Chart</button>

Now we need to update our script to first load the <select> tags with the CSV headers, and also to respond to a click on our button.

#### Adding <option> tags to the <select> elements

Immediately following the assignment of arrayData, add the CSV headers to the <select> elements:

    // use arrayData to load the select elements with the appropriate options
    for (var i = 0; i < arrayData[0].length; i++) {
      // this adds the given option to both select elements
      $("select").append("<option value='" + i + "'>" + arrayData[0][i] + "</option>");
    }

Make sure the <select> elements show the starting options:

    // set the default selection
    $("#domain option[value='0']").attr("selected","selected");
    $("#range option[value='1']").attr("selected","selected");

#### Updating the Chart

Now we need to assign a function to the button we created. Add the following after chart.draw(view, options);:

    // set listener for the update button
    $("button").click(function(){

Assign the selected column indices to local variables:

    // determine selected domain and range
    var domain = +$("#domain option:selected").val();
    var range = +$("#range option:selected").val();

Update the view to reflect the selected columns:

    // update the view
    view.setColumns([domain,range]);

Update the axis titles and the axis ranges:

    // update the options
    options.hAxis.title = data.getColumnLabel(domain);
    options.hAxis.minValue = data.getColumnRange(domain).min;
    options.hAxis.maxValue = data.getColumnRange(domain).max;
    options.vAxis.title = data.getColumnLabel(range);
    options.vAxis.minValue = data.getColumnRange(range).min;
    options.vAxis.maxValue = data.getColumnRange(range).max;

Update the chart and close the function block:

    // update the chart
    chart.draw(view, options);
    });

Cool! Now we can do more interesting comparisons like plotting cm_16_exp and mean_educ.
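To make the axis-range update concrete, here is a plain-JavaScript sketch of the kind of {min, max} pair the listener reads from data.getColumnRange(). It illustrates the idea only and is not how Google’s DataTable computes the range; the helper name columnRange and the sample rows are made up for this example.

```javascript
// Hypothetical helper: find the smallest and largest value in one column
// of a 2-D array whose first row holds the column headers.
function columnRange(arrayData, col) {
  var min = Infinity;
  var max = -Infinity;
  for (var i = 1; i < arrayData.length; i++) {   // start at 1 to skip the header row
    var value = arrayData[i][col];
    if (typeof value !== "number" || isNaN(value)) continue; // ignore non-numeric cells
    if (value < min) min = value;
    if (value > max) max = value;
  }
  return { min: min, max: max };
}

// Made-up rows in the same shape as arrayData in the tutorial
var exampleData = [
  ["hhid", "hhsize", "income"],
  [1, 4, 250.5],
  [2, 6, 410],
  [3, 2, 120]
];

var incomeRange = columnRange(exampleData, 2);
console.log(incomeRange.min, incomeRange.max);
```

Setting the axis minValue and maxValue from a pair like this each time the selection changes keeps the axes fitted to whichever columns are currently plotted.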
### Even Better UX

UX = User experience. User experience design is an important consideration. We want visitors to our site/blog to enjoy exploring our data. To make our chart more enjoyable, let’s remove the annoying step of having to click the button to update. Simply change this:

    $("button").click(function(){

to this:

    $("select").change(function(){

and remove the <button> tag. Now your chart should look like this (view the source and compare to yours if your chart is not working).

### Conclusion

I hope you enjoyed this tutorial, and especially the end project. Now, to use your own CSV, all you need to do is change the file string “kzn1993.csv” to the name of your CSV and change the title in the chart options. In the next tutorial, we’ll use the Google visualization library to make a bubble chart. (Check out the third tutorial in this series: Google Charts and CSV Part 3: Side-by-Side Bubble Charts.) As always, place any questions or comments in the section below. Thanks!
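One detail from the step that fills the <select> elements: the header text is pasted straight into HTML, so a column name containing characters such as & or < could produce broken markup. A minimal sketch of building the option tags with escaping follows; escapeHtml and buildOptionTags are hypothetical helper names, and the header list is invented for the example.

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: one <option> per header, valued by its column index
function buildOptionTags(headers) {
  return headers.map(function (label, index) {
    return "<option value='" + index + "'>" + escapeHtml(label) + "</option>";
  }).join("");
}

// Invented header row; the second name contains a character that needs escaping
var markup = buildOptionTags(["hhid", "income & transfers", "hhsize"]);
console.log(markup);
```

In the tutorial’s script, a string built this way could be passed to $("select").append(...) in place of the loop, with arrayData[0] as the header list.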

## The Economist Illustrated: China

Illustrated by: Joel Hopler

### The Inspiration

China’s cash crunch: Bear in the China shop

### Illustrator’s Notes

This article left me with the impression that China has the potential to rebalance its economy. While the article largely focuses on bearish Chinese lending, the point is made that the Chinese government has effective controls to bring back the bull. To represent this point I show a tamed bull, drinking tea in a china shop.
 
