Keeping data and code together with org-mode


from John D. Cook on 2022-08-15 13:24 (#62H5V)

With org-mode you can keep data, code, and documentation in one file.

Suppose you have an org-mode file containing the following table.

 #+NAME: mydata
 | Drug | Patients |
 |------+----------|
 | X    |      232 |
 | Y    |      351 |
 | Z    |      117 |

Note that there cannot be a blank line between the NAME header and the beginning of the table.

You can bring this table into Python simply by declaring it to be a variable in the header of a Python code block.

 #+begin_src python :var tbl=mydata :results output
 print(tbl)
 #+end_src

When you evaluate this block, you see that the table is imported as a list of lists.

 [['X', 232], ['Y', 351], ['Z', 117]]
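Since the block receives an ordinary list of lists, plain Python is enough for simple summaries. The following is a minimal sketch, not part of the original post: `tbl` is assigned by hand to the value shown above so that the snippet runs outside org-mode (inside a source block, the `:var tbl=mydata` header would supply it), and the names `patients_by_drug` and `total` are made up for illustration.

```python
# Stand-in for the value org-mode passes via ":var tbl=mydata".
tbl = [['X', 232], ['Y', 351], ['Z', 117]]

# Each row is [drug, patients]; turn the rows into a dict keyed by drug.
patients_by_drug = {drug: patients for drug, patients in tbl}

# Total number of patients across all drugs.
total = sum(patients_by_drug.values())

print(patients_by_drug)
print(total)
```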

Note that the column headings were not imported into Python. Now suppose you would like to retain the headers, and use them as column names in a pandas data frame.

 #+begin_src python :var tbl=mydata :colnames no :results output
 import pandas as pd
 df = pd.DataFrame(tbl[1:], columns=tbl[0])
 print(df, "\n")
 print(df["Patients"].mean())
 #+end_src

When evaluated, this block produces the following.

   Drug  Patients
 0    X       232
 1    Y       351
 2    Z       117

 233.33333333333334

Note that in order to import the column names, we told org-mode that there are no column names! We did this with the header option

 :colnames no

This seems backward, but it makes sense. The option tells org-mode to bring in the first row of the table as ordinary data, even though it looks like a header row, which would not be imported by default. We then tell pandas to build a data frame from all but the first row (i.e. tbl[1:]) and to use the first row (i.e. tbl[0]) as the column names.
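To make the slicing concrete, here is a small sketch of what the pandas block above is working with. It assumes, as that code implies, that with `:colnames no` org-mode passes the header row as the first element of `tbl`. The list is written out by hand so that the snippet runs outside org-mode, and it uses only the standard library rather than pandas. The names `header`, `rows`, and `records` are illustrative.

```python
from statistics import mean

# Assumed value of tbl under ":colnames no": header row first, then data rows.
tbl = [['Drug', 'Patients'], ['X', 232], ['Y', 351], ['Z', 117]]

header = tbl[0]   # what pandas receives as columns=
rows = tbl[1:]    # what pandas receives as the data

# Build one dict per data row, keyed by column name.
records = [dict(zip(header, row)) for row in rows]

print(header)
print(records[0])
print(mean(r['Patients'] for r in records))
```

The mean printed here should agree with the value pandas reports, since both average the same three patient counts.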

A possible disadvantage to keeping data and code together is that the data could be large. But since org files are naturally in outline mode, you could collapse the part of the outline containing the data so that you don't have to look at it unless you need to.

The post "Keeping data and code together with org-mode" first appeared on John D. Cook.
