Mind map usage example: John Boyd Papers Index
[by Lynn C. Rees, with points from John Boyd]
Earlier this year, Scott kindly shared a PDF index of the John Boyd Papers (see the tail end of the PDF here). While PDFs are good at preserving document layout, they’re poor at storing clean text data. Since I wanted the index in a spreadsheet to facilitate searching and sorting, this was an issue. Data extraction into machine-readable formats remains painful. Data extraction from PDFs is even more painful: PDFs prioritize prettiness for the human eye, not prettiness for the machine.
Fortunately, pdftotext can extract the text data to plain text. But, even then, the John Boyd index text was misaligned and out of order due to its formatting in the original document. It also needed to be broken down into useful chunks that could be mapped to spreadsheet cells. I decided to use Freeplane to reformat the text into a form appropriate for piping into a spreadsheet since it has elements of asynchronous text editing.
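As a rough illustration of that extraction step, here is a minimal sketch in Python, assuming a local copy of the index PDF (the filename is a placeholder) and that the pdftotext command-line tool from poppler-utils is installed:

```python
# Minimal sketch of the pdftotext extraction step described above.
# "boyd_papers_index.pdf" is a placeholder filename, not the actual file.
import subprocess

PDF_FILE = "boyd_papers_index.pdf"   # assumed local copy of the index PDF
TXT_FILE = "boyd_papers_index.txt"

# -layout asks pdftotext to approximate the original page layout, which helps
# keep multi-column index entries from interleaving quite as badly.
subprocess.run(["pdftotext", "-layout", PDF_FILE, TXT_FILE], check=True)

# Read the raw dump and drop blank lines; the remaining cleanup (reordering
# misaligned columns, splitting entries into useful chunks) still has to be
# done by hand or with further scripting, as described in the post.
with open(TXT_FILE, encoding="utf-8") as f:
    lines = [line.rstrip() for line in f if line.strip()]

print(f"{len(lines)} non-blank lines extracted")
```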
I don’t know if a true asynchronous text editor exists. I’m not sure I know what one would look like. But I have some notion of what it isn’t. Most text editors and word processors are good at sequential editing of text; they only sort of approach asynchronous text editing, where text can be moved around and reordered freely without copying and pasting. Asynchronous text editing was what I wanted, and Freeplane kind of does it.
I pasted the plain text into Freeplane and started breaking it down. Progress was slow: a lot of awkward, time-consuming cutting and pasting was required, which was annoying. Once I created some additional text-manipulation tools for Freeplane, things moved along nicely.
Due to intervening time constraints, the Boyd Papers index hasn’t made it to spreadsheet form yet, but it is broken down in Freeplane. Though mind maps are most commonly used as a brainstorming tool, they are also useful for rearranging existing text data into a hierarchy. Since the John Boyd index mind map is a useful example of this, here’s what’s done so far (with a sketch of the planned spreadsheet conversion after the list):
- the index as an image (5.9 MB in size, requires some magnification within the browser)
- the original Freeplane mind map (536.7 KB in size)
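For the eventual spreadsheet step, something along these lines should do. This is only a sketch, assuming the plain-text node labels in the Freeplane file live in a TEXT attribute of nested <node> elements (the filenames here are placeholders):

```python
# Sketch: walk the nested <node> elements of a Freeplane .mm file (plain XML)
# and write one CSV row per node, one column per level of the hierarchy.
import csv
import xml.etree.ElementTree as ET

MM_FILE = "boyd_index.mm"    # placeholder name for the Freeplane mind map
CSV_FILE = "boyd_index.csv"

def walk(node, path, rows):
    """Depth-first walk collecting each node's text plus its ancestors."""
    text = node.get("TEXT", "").strip()
    new_path = path + [text] if text else path
    if text:
        rows.append(new_path)
    for child in node.findall("node"):
        walk(child, new_path, rows)

tree = ET.parse(MM_FILE)
rows = []
for top in tree.getroot().findall("node"):
    walk(top, [], rows)

with open(CSV_FILE, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)
```

Each row carries the full ancestor path of a node, so the spreadsheet keeps the hierarchy imposed on the index in Freeplane.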
May 21st, 2013 at 10:30 pm
This is interesting. Looking at your image file and breaking down the bibliography that way is pretty common in bibliometrics. The next stage would be to take each of those authors and work your way towards node association of the authors: which author quoted which author. You’ll end up with constructivist versus reductionist graphs. There are some really neat tools for mind mapping and concept graphing. Here is an example of a poorly understood domain, http://selil.com/CYBER/, and a better-understood area of knowledge, http://selil.com/CLOUD/; the second is much more constructivist, linking concepts back to other concepts and not merely conceptually reducing them. Interesting stuff.
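A toy sketch of that “which author quoted which author” step, using networkx; the authors and citation pairs below are invented purely for illustration and are not taken from the Boyd index:

```python
# Build a small directed "who cites whom" graph and rank authors by how often
# they are cited. The citation pairs are illustrative placeholders only.
import networkx as nx

citations = [
    ("Boyd", "Sun Tzu"),
    ("Boyd", "Clausewitz"),
    ("Osinga", "Boyd"),
]

G = nx.DiGraph()
G.add_edges_from(citations)

# In-degree is a crude proxy for how often an author is quoted by the others.
for author, count in sorted(G.in_degree(), key=lambda x: -x[1]):
    print(author, count)
```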
May 22nd, 2013 at 7:23 pm
“The next stage would be to take each of those authors and work your way towards node association of the authors.”
But to get to a node, don’t you have to cross an edge, and isn’t that one gap (TPMB) no one wants to cross?
May 27th, 2013 at 7:35 pm
During the 1990s, I went in to the NIH National Library of Medicine to do some work on the structure of medical knowledge. They had UMLS, which had both a hierarchical and a mesh organization, and a contract to load UMLS into an RDBMS. RDBMS table joins can be used to represent hierarchical relationships, but they are terrible for any-to-any mesh relations. It turned out that 9 months of new medical knowledge organization was taking longer than 9 months of elapsed time to normalize in an RDBMS-style database. I had worked on the original SQL/relational implementation (System/R), but at the same time I had also worked on a different type of relational database that supported arbitrary any-to-any relations (the System/R RDBMS table structure was optimized for doing financial transactions). In any case, I was able to demonstrate loading/representation of the full UMLS structure in a couple of weeks (compared to the enormous amount of time and resources it takes for RDBMS normalization).
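A small sketch of that hierarchy-versus-mesh distinction, using Python’s built-in sqlite3; the table and column names are illustrative and bear no relation to the actual UMLS schema:

```python
# A self-referencing table handles a strict hierarchy with a join, while an
# any-to-any mesh is more naturally kept as an explicit edge table.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hierarchy: each concept points at (at most) one parent.
cur.execute("CREATE TABLE concept (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)")
cur.executemany("INSERT INTO concept VALUES (?, ?, ?)",
                [(1, "disease", None), (2, "infection", 1), (3, "viral infection", 2)])

# Mesh: arbitrary any-to-any relations go into a separate edge table.
cur.execute("CREATE TABLE relation (from_id INTEGER, to_id INTEGER, kind TEXT)")
cur.executemany("INSERT INTO relation VALUES (?, ?, ?)",
                [(3, 2, "is_a"), (3, 1, "is_a"), (2, 3, "example")])

# One join recovers each parent/child pair in the hierarchy ...
for row in cur.execute("""SELECT child.name, parent.name
                          FROM concept child JOIN concept parent
                            ON child.parent_id = parent.id"""):
    print("hierarchy:", row)

# ... and the mesh is queried directly from the edge table.
for row in cur.execute("""SELECT a.name, r.kind, b.name
                          FROM relation r
                          JOIN concept a ON r.from_id = a.id
                          JOIN concept b ON r.to_id = b.id"""):
    print("mesh:", row)
```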
I’ve also used the technology for demonstrating merged taxonomies and glossaries for organizing financial and security knowledge, generating HTML files that attempt to represent some of the organization (a subset of knowledge representation).
The agency responsible for both the Orange Book and the Common Criteria has admonished me in the past about merging both viewpoints (they wanted a complete transition from the old viewpoint to the new).
I have also done something similar for the Internet standards and documents with several different HTML files representing several different views/facets of the information.
May 28th, 2013 at 11:42 pm
OK, I parsed your mind map and merged it with the Patterns of Conflict Bibliography and generated a rough HTML file somewhat along the lines of what I do for internet standards.
See temporary demo here:
http://www.garlic.com/~lynn/boydread.html
If you click on a title link, it will do a Google search for the title + author(s). In some cases, the Google results will point to a scanned “Google Book”.
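Something along the lines of that HTML generation could be sketched as follows; the entries and output filename are placeholders, not the merged Boyd bibliography or the actual boydread.html generator:

```python
# Emit one list item per bibliography entry, turning the title into a Google
# search link for title plus author(s). Entries here are placeholders.
from urllib.parse import quote_plus
import html

entries = [
    ("A Discourse on Winning and Losing", "John Boyd"),
    ("On War", "Carl von Clausewitz"),
]

items = []
for title, author in entries:
    query = quote_plus(f"{title} {author}")     # URL-encode title + author
    url = f"https://www.google.com/search?q={query}"
    items.append(f'<li><a href="{url}">{html.escape(title)}</a>, {html.escape(author)}</li>')

page = "<html><body><ul>\n" + "\n".join(items) + "\n</ul></body></html>"

with open("boydread_demo.html", "w", encoding="utf-8") as f:
    f.write(page)
```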
June 21st, 2013 at 9:55 pm
Recently, I’ve updated it with some of the B&B 2012 reading list. As an aside … I periodically resort to emacs for parsing pdf2txt output (& other stuff) into somewhat regular form.