Beta site launch

05Oct08

We’ve just opened access to a test drive of the site at www.phylosophy.net. At present, you can search for individuals and institutions within the database, and explore connections between them using links. The most recent degrees and appointments from our core set of schools are included, as well as advisors. For a good, complete sample, check out our home institution, CUNY, as well as some of our recent PhDs, such as James Snyder, Fritz McDonald, and Christine Vitrano.

At this point, the data is still tabular, but we’re making steady progress on our first visualization, which should be an institutional timeline. Charts, graphs, and network maps should follow in the coming months. We’ve also disabled account creation and data editing/uploading for the moment, until the rest of our initial, verified dataset has been entered.

After you’ve had a chance to play around a bit, drop us a line in the Feedback section of the Phylo forum or via email (phylo@phylosophy.net) with your initial thoughts on site design and usability.

The hundred years problem

15Apr08

Increasingly, I think we’re saddled with what I’m calling the “hundred years” problem. By that, I mean that from at least 2000 forward, it’s fairly easy to compile degree, appointment, and publication information, since (nearly) all of it is published on the web (and sometimes even available in RSS, XML, or flat data formats). Some of this harvesting is complicated by nonstandard metadata, but web-wide standards like Dublin Core are emerging to address these worries.

So much for the future. Let’s consider the more distant past, namely information before 1900. Much of this isn’t available at all for minor figures in the field (who probably make up the largest share of the field), and information on major figures is the province of specialized historians and archival efforts. Google Books and the Universal Digital Library are making some headway in archiving older materials, but the process is slow-going and it’s limited to books at the moment (we are, after all, interested in other records as well). Incidentally, UDL estimates that no more than 10 million of the 100 million books since recorded history were written before 1900. Those 10 million will be a huge task, but the bigger task is 1900-2000, at least by the numbers game.

And that’s where we’ve entered. In focusing on North American philosophy since the first dissertations in the 1880s, we’ve started off Phylo right in the middle of these hundred years of densest material. The problem, of course, is that it’s close enough to the present to obtain, yet time-consuming and costly enough to present a real deterrent. We will, of course, have plenty of this information from the start, given the longevity of the programs we’ve chosen to research. But complete saturation looks almost as difficult here as it does for pre-1900 data, where we often don’t know how much exists (and thus how complete our current records are).

Recognizing this problem has led us to think more about our longer short-term goals. Without a great chance of success in filling in 1900-2000 data, it might make sense to start expanding back further, to pre-1900 information that historians already have available. We’ve always known this will require some conceptual changes (e.g., ‘degree’ and ‘institution’ need to be understood more metaphorically as periods of study and places where philosophy happens). In light of the hundred years problem, though, it might be useful to make these changes sooner and start collecting more varied data from earlier periods in philosophy.

ISI Web of Science

02Apr08

David and I both attended presentations on ISI Web of Science today. WoS is taking an interesting and, in many ways, different approach as a search tool. Here are a few of the things that stood out:

  • Keywords are de-emphasized. There is no taxonomy associated with WoS (since it is so interdisciplinary in scope), so users are encouraged to search by authors (including their home institutions) and particular publications. WoS does assign keywords to articles using an algorithm that looks at titles and summaries, so users can search by topic, but it’s certainly not the preferred method.
  • Influence is understood in terms of citations. Each record is tagged with as many citation links as possible (only journal articles are included). As searchers, we were shown how to find the handful of mega-articles that hundreds of other articles on a topic all cite in common. If this really is a good measure of influence, it seems possible that one could jump into any topic knowing virtually nothing about its major players and sift them out from pure citation counts.
  • H-scores. Certain Doubts has had several posts about h-scores in the past few months, so I’ll simply refer you to discussions on 29 Nov, 13 Dec, 15 Dec, 17 Dec, 19 Dec, and 28 Dec.
  • Search queries seem pretty user-intensive. There are no fuzzy search capabilities (“Did you mean X?”), so there was a lot of emphasis on wild card and truncated search strings. (See below.)
  • Some attempt at visualizations. I noticed two kinds of citation reports available for viewing. One shows the number of publications returned for any search; the other shows the number of citations within that publication set. These charts are static images generated upon request, and seem similar to Scopus’ visual capabilities (although I wouldn’t know because the server always times out before my image is generated by Scopus). Here are the two charts I generated for “rawls AND justice”.

WoS has data for arts and humanities going back to 1975, and I think it will be interesting to see how much it catches on in the humanities and in philosophy. One general limitation (one that I raise in An Introduction to Phylo) is the way in which this tool makes the user do the work, rather than the other way around. I was struck by how much the presenter of the session was essentially training us to work with the tool by favoring publication data over keywords and filtering searches in certain ways, rather than giving us an intuitive tool that worked however we found most natural. In general, I think this underscores the need for more participatory design in building search tools.

Beyond just asking users what they think of the tools we’ve built, we need to learn more beforehand about how they process information and in what forms they find that information most cognitively salient. I think we’ll learn some of this once we launch and revise our displays, and I hope we can come up with some model of participatory design that facilitates the process.
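For readers who haven’t met the h-scores mentioned in the list above, here is a minimal sketch of the standard definition: an author has score h if h of their papers have each been cited at least h times. The function name and the sample citation counts are made up for illustration; this is not how WoS computes or stores anything.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # prints 4
```

Note that a single heavily cited paper barely moves the score, which is part of why h-scores and raw citation counts can rank people differently.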

External readers

26Mar08

While processing dissertation title pages, I’m finding a number of signatures that don’t belong to any of the faculty members listed for the department. In some cases, these are faculty from other departments (e.g., linguistics, Greek) at the same institution, but in many cases, I suspect these are external reviewers from other programs. Unless these faculty are named in (say) an acknowledgment page, there’s virtually no way to figure out who signed. (I’m wondering, in general, how frequent external readers are for North American institutions.) My guess is that we’ll have to develop some kind of “wish list” for this kind of missing data.

Criteria for inclusion in Phylo

02Feb08

David and I have been revisiting the issue of who should be included in Phylo. Since our goal is to provide a resource tool for the entire field, it makes sense to define our criteria quite broadly. Our nearest cousin, The Mathematics Genealogy Project, takes a similar approach: “Throughout this project when we use the word “mathematics” or “mathematician” we mean that word in a very inclusive sense. Thus, all relevant data from statistics, or computer science or operations research is welcome.”

While we think this broad approach is merited, there are a few complications. First, philosophy seems to have more interdisciplinary connections than mathematics. Some issues in political theory, classics, literature, psychology, physics, and so on are arguably philosophical issues, so broad standards for Phylo would probably include a lot more people than broad standards for MGP. Also, given that we also plan to export Phylo’s capabilities for use in other disciplines, we’d like to maintain some kind of boundary over what counts as philosophy and what counts as other disciplines, otherwise we’ll have huge redundancies across these systems. These boundaries aren’t going to be hard-and-fast, but they should generally reflect the familiar people and publications specific to our individual fields. After thinking it over, we’ve come up with three individually sufficient criteria for inclusion in Phylo:

Philosophers included in Phylo have (a) received a doctoral degree in philosophy, (b) taught in a philosophy department, or (c) published an article in a philosophical journal or a book categorized under the Library of Congress subject heading ‘Philosophy’.

(a) and (b) generally capture who has studied and taught philosophy. Of course these may not apply to philosophers working before 1860, but including them in Phylo will require some changes to our database itself, in addition to these criteria.

For the moment, I’m more worried about how well this nets continental philosophers working in departments (e.g., comparative literature) outside of mainstream philosophy departments. My hope is that these philosophers will still fall under (c), which is rather broad in its own right and includes headings for ‘existentialism’, ‘phenomenology’, ‘hermeneutics’, and ‘literary theory’. It may also help that figures like Foucault, Lacan, Deleuze, and Zizek are all tagged with ‘philosophy’ in sources like Amazon, which should contribute some of our publication information. I’m not completely satisfied that this handles the issue, but any broader criteria we discussed seemed to include academics from too many other disciplines.
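Because criteria (a), (b), and (c) are individually sufficient, the inclusion test amounts to a logical OR over three yes/no facts about a person. The sketch below shows that reading; the `Person` class and its field names are hypothetical and are not taken from the actual Phylo database.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record; field names are illustrative only.
    name: str
    has_philosophy_doctorate: bool = False          # criterion (a)
    taught_in_philosophy_department: bool = False   # criterion (b)
    published_in_philosophy: bool = False           # criterion (c): journal article or LoC 'Philosophy' book


def qualifies_for_phylo(person):
    """Return True if the person meets at least one of the three criteria."""
    return (
        person.has_philosophy_doctorate
        or person.taught_in_philosophy_department
        or person.published_in_philosophy
    )


# Someone housed in a comparative literature department who meets only (c).
outside_department = Person("Example Author", published_in_philosophy=True)
print(qualifies_for_phylo(outside_department))  # prints True
```

On this reading, the worry about continental philosophers comes down to whether criterion (c) is reliably true for them when (a) and (b) are not.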

I’m curious what others think about these criteria. Is there anyone who is left out that clearly should be included? Are there other criteria that would be more accurate or representative? Thoughts and suggestions are always welcome.

Faculty data entered

21Jan08

We’ve just finished our last import of faculty appointments into the database. The grand total looks like 1,715 philosophers, give or take a few duplicate entries and unresolved names. These professors held over 3,000 appointments, so even at this very abstract level, you can already get a sense of how closely our 20 current schools are connected.

I did some quick sorting of the list to pull out the professors with the most appointments and longest career spans. No surprise: these turn out to be most of the household names in the field. After adding a few new figures, I used the list to generate a background for a project image. Here’s a low-res sample of it:

[Image: name_warp]
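The kind of quick sorting described above can be sketched as follows. This assumes each appointment is a (name, school, start year, end year) record; the record layout and the placeholder rows are invented for illustration and are not real Phylo data.

```python
from collections import defaultdict

# Hypothetical appointment records: (philosopher, school, start_year, end_year).
appointments = [
    ("Philosopher A", "School 1", 1950, 1960),
    ("Philosopher A", "School 2", 1960, 1985),
    ("Philosopher B", "School 1", 1970, 1975),
    ("Philosopher C", "School 3", 1940, 1955),
    ("Philosopher C", "School 1", 1955, 1965),
    ("Philosopher C", "School 2", 1965, 1990),
]


def summarize(records):
    """Map each name to (number of appointments, career span in years)."""
    counts = defaultdict(int)
    first_year = {}
    last_year = {}
    for name, _school, start, end in records:
        counts[name] += 1
        first_year[name] = min(start, first_year.get(name, start))
        last_year[name] = max(end, last_year.get(name, end))
    return {
        name: (counts[name], last_year[name] - first_year[name])
        for name in counts
    }


summary = summarize(appointments)

# Rank by number of appointments, then separately by career span.
by_appointments = sorted(summary, key=lambda n: summary[n][0], reverse=True)
by_span = sorted(summary, key=lambda n: summary[n][1], reverse=True)

print(by_appointments)
print(by_span)
```

Duplicate entries and unresolved names would inflate or split these counts, so a real pass would need to reconcile names first.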

Project poster

05Dec07

We have a (hopefully) permanent and updatable project poster up at The New Media Lab website that includes an RSS feed from this blog and hopefully, in time, recent topics from the forum. Visit http://www.newmedialab.cuny.edu/phylo/.

As usual, we’ll continue to post regular updates on the project to this blog.

Flex and Google Maps

29Nov07

We’ve tentatively chosen two applications to run our visualizations: Adobe Flex, which would handle netMap and chronoMap, and Google Maps, which would run geoMap (probably with a Flex overlay).

The choice for netMap and chronoMap was a tough one. There are a few nice, open source tools out there, including prefuse, Simile Timeline, and Simile Exhibit. All of these are free, and very much in the spirit of Phylo. But each has its drawbacks, and using three different tools to run visualizations might slow down loading time and make integrating displays difficult. There are also some worries about getting any Java-based tools to perform reliably in different browsers.

Flex overcomes a lot of these worries. It runs in a Flash environment (which is standard across all browsers) and it allows us to implement netMap and chronoMap in a single application. It also has some neat animated transitions, which you can see at http://www.adobe.com/devnet/flex/samples/dashboard/dashboard.html.

The choice to go with the Google Maps API was a bit easier. The application is constantly being expanded, and there are lots of ways to customize it for our needs. Ideally, we’ll overlay some Flex elements on Google Maps, but it’s hard to say where that technology will be by the time we launch.

At any rate, you can expect to see some slick and consistent visualizations run by Flex.

drupal implementation

20Nov07

Phylo will run on drupal, an open source content management system (CMS). After a few months of development, we’ve realized the need for several things:

  • a robust tracking system that can record additions and changes, including user information and comments on why changes are being made,
  • a secure sign-in area for user information,
  • a search engine with advanced capabilities, and
  • a general system that can be updated without heavy time investment in new coding.

drupal is one of the leading CMSes and will be able to meet all of these needs, and more. One of the most user-friendly features will be a site-wide login system. Once you’re signed in through the website, blog, or forum, you’ll automatically be signed in to the other two as well, allowing you to upload or change information, make comments, and post messages without additional logins. Hopefully this will help to encourage discussion on information, since the forum will always be one click away through the main menu.

drupal will also come in handy in the long term. Once the core visualizations and functions are complete, we’ll be able to export a Phylo drupal module for use in other fields. So when someone in (say) English or sociology wants to create a Phylo in their own discipline, they’ll just need to install drupal and then activate the Phylo module.

Phylo and InPhO

24Oct07

At NA-CAP 2007, we were introduced to The Indiana Philosophy Ontology Project (InPhO), which is developing a dynamic formal ontology for philosophy. A main focus of the project is developing a way to handle metadata for the Stanford Encyclopedia of Philosophy (SEP). As many of you know, the current SEP entries are searchable and listed alphabetically. With InPhO, they should soon appear in hierarchies, with narrower entries (e.g., higher-order thought) falling under broader categories (e.g., consciousness).

The beauty of InPhO is that it’s continually updated by running statistics over SEP entries to identify likely relationships between terms. It also uses a bit of expert input to refine these relationships.

At the moment, we’re talking to Colin Allen and Cameron Buckner about using InPhO to taxonomize publication information, including dissertations. This has several upshots:

  • It eliminates the need for us to create yet another keyword hierarchy in philosophy. That should keep down online clutter and free up our time to work on other parts of Phylo.
  • It will standardize ontology across Phylo and SEP, which should make searching different sources easier.
  • It guarantees that any keyword in Phylo has corresponding information available through SEP.

We’ll keep you updated as things develop with InPhO. In the meantime, you can browse the first iteration of the ontology at http://inpho.cogs.indiana.edu:16080/taxonomy/.