How Wikidata might help the Smithsonian with its mission to diffuse knowledge

Clara de Pablo is a Fellow in the Office of Communications and Marketing at the Smithsonian’s National Museum of American History. She enrolled in Wiki Education’s introductory Wikidata course to learn more about how to apply linked data practices to her work.


My involvement with Wikidata began, as all great stories do, with a long and meandering intern task. I work in communications at the Smithsonian’s National Museum of American History, and we’d decided to invite members of Congress to a preview of an upcoming exhibition. To narrow down the 535 members of Congress, we asked our high-school intern Sofia to make a list of all the members with daughters. We thought this would take her a few days. A full month later, Sofia was still Googling senators and typing the ages of their daughters into a Word document. The information existed, but it wasn’t searchable or organized in a usable way.

Wikidata seemed like the perfect solution. It is essentially Wikipedia for data: so much data exists online, and in theory, Wikidata makes it easy to search and navigate. A week or so into Sofia’s research project (she was on senators whose last names started with “S”), I signed up for a Wikidata course.

The course itself was set up in a very friendly, helpful way. The class met every Tuesday afternoon via video call, during which our instructors, Will and Ian, would share their screens and show us how to navigate or use one aspect of Wikidata. The weeks were organized into tangible objectives: learning what “item” or “property” meant, learning how to edit entries, learning how to query. Before each meeting, there was a short slideshow tutorial on our class dashboard, which introduced the concepts and often guided us through short exercises to apply them. The video calls were especially useful for troubleshooting the places where we were getting stuck and for seeing what other people in the course were doing with their newfound skills. The instructors often recommended queries to look at for inspiration, and made themselves available to answer questions via email or Slack outside of our meeting times.

Queries in particular demanded special attention. Wikidata is searchable through querying: using the query language SPARQL to “ask” a specific question and pull the answering dataset out of a massive amount of data. As the course progressed, it became very apparent that Wikidata has a steep learning curve. A few members of the class had figured out how to use queries to make elaborate interactive data trees; I had only succeeded in changing “Instance of: dog” to “Instance of: cat.” For fun, I tried to see if I could make a chart of the US senators. I was frustrated for nearly an hour before I realized that I needed to add the boundary “Instance of: human.”
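For readers curious what that boundary looks like in practice, here is a minimal sketch in Python that assembles a SPARQL query for people who have held the position of US senator. It is an illustration, not the query from the course. The identifiers are assumptions to verify on Wikidata before use: P31 ("instance of"), Q5 ("human"), P39 ("position held"), and Q4416090 (believed to be "United States senator"). The function names are made up for this example, and the query would return everyone who has ever held the position, not only sitting senators.

```python
from urllib.parse import urlencode

# Assumed Wikidata identifiers; double-check each one on wikidata.org.
INSTANCE_OF = "P31"      # property: instance of
HUMAN = "Q5"             # item: human
POSITION_HELD = "P39"    # property: position held
US_SENATOR = "Q4416090"  # item: United States senator (assumption)

# Public endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_senator_query(limit=100):
    """Return SPARQL text asking for humans who have held the senator position."""
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:{INSTANCE_OF} wd:{HUMAN} ;
          wdt:{POSITION_HELD} wd:{US_SENATOR} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


def query_url(sparql):
    """Build a URL that asks the endpoint for JSON results (no request is sent)."""
    return ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    # Print the query so it can be pasted into the Wikidata Query Service.
    print(build_senator_query(limit=10))
```

The line `wdt:P31 wd:Q5` is the “Instance of: human” boundary. The printed query can be pasted into the Wikidata Query Service and adapted from there, for example by swapping in a different position.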

This points to a fundamental challenge within Wikidata: linked data is only useful or usable to people who understand querying. Without access to a months-long course, I would never have been able to figure it out. Even with the help of the course, there is still a lot about Wikidata left to learn before I can build my own datasets in a meaningful way. This barrier to usability keeps most people from joining the Wikidata community — and, just like Wikipedia, Wikidata works best when more people contribute their expertise to it. 

The course helped lower some of these barriers for me: I am able to create and edit items, and adapt existing queries to answer simple questions. I feel confident that I could create usable datasets using what I learned about items and properties, and link them to existing Wikidata entries. The course took something that looked intimidating from the outside (strings of numbers that defined properties, a scary new coding language) and broke it down into a series of simple, logical steps. Ultimately, the greatest benefit of joining was having access to teachers who could help answer my questions when I got stuck.

Wikidata has great potential in the museum field if it becomes more user-friendly. The Smithsonian, like cultural institutions across the country, has an incredible treasure trove of data and information in its collections, but that information is barely accessible, if at all, to members of the public. The ability to use linked data to search our digital collections would make our information usable to anyone who wanted it. For example, take classroom education. Using linked data, teachers and students could easily search for historic events from the same year, baseball gloves owned by World Series champions, or American presidents with pet pigs. Museum curators might know these things off the top of their heads from countless years spent in the collections, but the information in their heads isn’t searchable by the general public.

The Smithsonian was founded in 1846 as an “establishment for the increase and diffusion of knowledge among men.” As the possibilities for sharing knowledge have rapidly expanded, the Smithsonian is racing to adapt to a technological world. The Smithsonian’s new Secretary, Lonnie Bunch, has declared one of his priorities to be making the Smithsonian “digital first.” Across the Smithsonian, hundreds of people are working to ensure that our digital databases reflect the full scope of the collections they represent. Linked data would help make these massive online stores useful and usable — all of the Smithsonian’s knowledge would be available to the public we serve. 

Wikidata could be an incredible resource for making data usable to the public, but linked data has a steep learning curve. Learning how to use it requires practice, a lot of patience, and a little bit of help sometimes. 

Interested in taking a course like the one Clara took? Visit Wiki Education to see current course offerings.

