Metadata for Using and Understanding Software

All scientific communities have been linking research together for many years using references to related work in articles. More recently, these communities have begun exploring options for linking to datasets and software. As part of this effort, the CodeMeta Project proposed a vocabulary for software metadata based on schema.org.

A mapping between the codeMeta vocabulary and the ISO 19115-1 metadata standard was recently added to the codeMeta Git repository. Creating this mapping surfaced some interesting differences between the two approaches, and an interesting similarity as well. The codeMeta vocabulary has been mapped to over twenty metadata dialects, listed along the bottom of the Figure. On average, these dialects map 11.2 of the 68 codeMeta terms. The ISO mapping, shown near the left edge of the Figure, includes 64 of the 68 terms. This indicates that codeMeta and ISO are more similar to each other than codeMeta is to most of the other dialects.
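To put those counts on a common scale, the short sketch below converts them to percentages of the 68 codeMeta terms. It uses only the numbers quoted above; the function name `coverage_percent` is a hypothetical helper for illustration and is not part of any codeMeta tooling.

```python
# Total number of codeMeta terms, as quoted in the text.
TOTAL_CODEMETA_TERMS = 68


def coverage_percent(mapped_terms, total_terms=TOTAL_CODEMETA_TERMS):
    """Return the share of codeMeta terms covered by a mapping, in percent.

    Hypothetical helper for illustration only.
    """
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    return round(100.0 * mapped_terms / total_terms, 1)


# Counts taken from the text: the ISO 19115-1 mapping and the dialect average.
iso_coverage = coverage_percent(64)
average_coverage = coverage_percent(11.2)

print(f"ISO 19115-1 mapping: {iso_coverage}% of codeMeta terms")
print(f"Average dialect:     {average_coverage}% of codeMeta terms")
```

Seen this way, the ISO mapping covers roughly 94% of the codeMeta terms, while the average dialect covers roughly 16%.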

The difference likely reflects the fact that most of these dialects focus primarily on citation and dependency identification, while ISO and codeMeta include metadata that supports use and understanding of data and software. This broader scope requires more metadata concepts, many of which codeMeta and ISO share. For details, see "Mapping ISO 19115-1 geographic metadata standards to CodeMeta" in PeerJ Computer Science.

This Figure shows the number of codeMeta terms that are included in mappings to many dialects. The red line shows the average number for many of these dialects (11.2). The two bars on the left show codeMeta and ISO 19115-1. The ISO mapping includes sixty-four of sixty-eight codeMeta terms. This similarity reflects the fact that both of these dialects include metadata to help users use and understand software as well as cite it.
