Searching Skylark


Variable metadata searching is at the heart of Skylark. The system has been designed to offer a range of searches, from simple keyword searches to more detailed combinations.


Simple Searching

Instant Keyword Search

After logging in, the main menu bar contains a grey ‘search’ box, which is retained throughout Skylark. This will perform a simple keyword search at any time during your Skylark session. Keyword searches will return matches from variable names or labels, and are not case sensitive.

N.B. Multi-term keyword searches are run as ‘exact phrase’ searches. For example, to find fish and chips you must type the whole phrase; typing fish chips will not match it.
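
The keyword behaviour described above can be illustrated with a minimal Python sketch. This is not Skylark's own code: the `Variable` class, the `keyword_search` function and the catalogue entries are all hypothetical, and the sketch assumes the phrase is matched as a substring of the name or label, which the guide does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str   # variable name, e.g. "FISH01" (hypothetical)
    label: str  # descriptive label (hypothetical)


def keyword_search(variables, phrase):
    """Return variables whose name or label contains the whole phrase, ignoring case."""
    needle = phrase.strip().casefold()
    if not needle:
        return []
    return [
        v for v in variables
        if needle in v.name.casefold() or needle in v.label.casefold()
    ]


# Hypothetical catalogue, for illustration only.
CATALOGUE = [
    Variable("FISH01", "Eats fish and chips weekly"),
    Variable("CHIP02", "Portions of chips per week"),
    Variable("FISH03", "Oily fish consumption"),
]

# The whole phrase matches, regardless of case.
exact = keyword_search(CATALOGUE, "Fish and Chips")
# The separate words "fish chips" are not treated as two keywords.
loose = keyword_search(CATALOGUE, "fish chips")
```

In this sketch `exact` contains only the first variable, while `loose` is empty because no name or label contains the literal phrase "fish chips".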


Main Search Menu

The options on the main Skylark search menu can be seen below:


Skylark search options


Simple search options are as follows:

  • Variable name – search for one or more specific variables by name. Names must match exactly (there is no fuzzy searching), and multiple names must be separated by a single space
  • Keyword – as above, search variable names and labels for keywords
  • Year – a dropdown box to select the year of data you are interested in (some years containing a lot of variables have been split into questionnaires)
  • Category – NSHD data are categorised into 27 broad groups, from ‘wellbeing’ to ‘education’. Select from a dropdown menu. Some categories give a large number of results
  • Topic – the data have been broken down further into topics. Select from a dropdown menu. The topic guides in the metadata repository include variable lists and ‘standard topic baskets’ to view and save
  • Library (see footnote 1) – the data consist of around 400 library files, roughly grouped on topic. Select from a dropdown menu
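
The exact-match rule for the variable name search in the list above can be sketched as follows. The function name and the variable names are hypothetical, and the sketch assumes names are compared case-sensitively, which the guide does not state.

```python
def variable_name_search(catalogue_names, query):
    """Return the requested names that exist in the catalogue, in query order.

    Names are split on single spaces and compared exactly: no partial
    or fuzzy matching is attempted.
    """
    requested = query.split(" ")
    known = set(catalogue_names)
    return [name for name in requested if name in known]


# Hypothetical variable names, for illustration only.
NAMES = ["BP89", "BMI89", "SMOK99"]

found = variable_name_search(NAMES, "BP89 SMOK99")   # both names exist
partial = variable_name_search(NAMES, "BP8")         # a partial name finds nothing
```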


Keyword is the most common search. However, a broad term will return a large number of hits and can slow the process. Selecting a keyword search from the menu gives a Soundex option: a broader, phonetic search that also returns results containing keywords that sound similar to your search term(s), and so produces many more hits.
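
To show why a Soundex search returns more hits, here is a minimal sketch of the standard American Soundex coding, in which words that sound alike share a four-character code. How Skylark applies Soundex internally is not documented here, so `soundex_search` and the labels below are assumptions for illustration only.

```python
import re

# Standard American Soundex digit groups.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the four-character American Soundex code for a word."""
    letters = [c for c in word.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate letters that share a code; vowels do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


def soundex_search(labels, terms):
    """Return labels in which every search term sounds like some word (hypothetical helper)."""
    wanted = [soundex(t) for t in terms.split()]
    hits = []
    for label in labels:
        codes = {soundex(w) for w in re.findall(r"[A-Za-z]+", label)}
        if all(code in codes for code in wanted):
            hits.append(label)
    return hits


# Hypothetical labels, for illustration only.
LABELS = ["Systolic blood pressure", "Height in cm"]
matches = soundex_search(LABELS, "presure")  # misspelling still finds the label
```

Because "presure" and "pressure" share the same code, the misspelt term still matches. The same looseness is why Soundex searches return many more results than exact keyword searches.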


Topic searching is a new feature. Selecting from a dropdown list of the main topics of data collected in NSHD loads a pre-selected set of the most commonly used variables on that topic. You can add these variables to your basket in bulk, and then add to or edit the contents (if desired) prior to saving. However, please note that the standard topic baskets are not an exhaustive list of everything we hold on a topic, nor do they necessarily contain the variables that are best for you. They are generally summary variables, and have been chosen based on previous usage and popularity.


Library searching can be another very useful ‘topic’ search. For example, the library ‘Alcohol14’ contains all the variables on alcohol use collected in 2014. However, only the more recent libraries have contextual names; much of the older data is housed in libraries with names such as ‘B01’ or ‘Y79’, which give no indication of their contents.

A guide to the NSHD data libraries and their contents is available.

  • Please note that not all the libraries listed in the guide are available on Skylark. Sensitive or possibly disclosive libraries may have been removed to protect the study and the identity of its participants.
  • Many of the libraries relate to sub-studies that have taken place over the years. In some cases, a variable will contain only a few hundred cases rather than the full sample. Please use the frequency tables shown for individual variables on Skylark to check the details before adding them to your basket.



Combination Searching

A variety of compound Boolean AND searches can also be undertaken on Skylark:

  • Keyword and year
  • Keyword and category
  • Year and category
  • Keyword, year and category


Soundex can also be applied to all combination searches containing keyword.
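
The AND logic of the combination searches can be sketched as a filter in which every criterion supplied must hold. This is an illustration under stated assumptions rather than Skylark's implementation: the `Variable` fields, the `combined_search` function and the catalogue entries are hypothetical, and the keyword test reuses simple case-insensitive substring matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str
    label: str
    year: int
    category: str


def combined_search(variables, keyword=None, year=None, category=None):
    """Return variables satisfying every criterion that was supplied (Boolean AND)."""
    results = []
    for v in variables:
        if keyword is not None:
            needle = keyword.casefold()
            if needle not in v.name.casefold() and needle not in v.label.casefold():
                continue
        if year is not None and v.year != year:
            continue
        if category is not None and v.category != category:
            continue
        results.append(v)
    # Results are listed alphabetically by variable name.
    return sorted(results, key=lambda v: v.name)


# Hypothetical catalogue, for illustration only.
CATALOGUE = [
    Variable("SCHL61", "Type of school attended", 1961, "education"),
    Variable("QUAL89", "Highest school qualification", 1989, "education"),
    Variable("MOOD89", "Mood in school holidays", 1989, "wellbeing"),
]

keyword_and_year = combined_search(CATALOGUE, keyword="school", year=1989)
all_three = combined_search(CATALOGUE, keyword="school", year=1989,
                            category="education")
```

Each extra criterion can only narrow the result set: `keyword_and_year` holds two variables here, and adding the category leaves one.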



Restricted Variables

Due to restrictions placed on us by NHS Digital, we are unable to share certain restricted data outside of the Unit. These variables will not appear in any search results. Data restricted in this way includes:

  • Mortality data
  • Cancer registrations
  • Hospital episodes (HES)

If you think your project may require access to these kinds of data, please contact us. It may be possible for you to access these data from inside the Unit.

N.B. Other sensitive or potentially disclosive variables may still be shown in search results. However, these are unavailable for external use and will be removed from baskets prior to processing.



Search Results

Whatever kind of search you perform, the results table will always appear the same way, showing the number of hits and details of the variables.


The example below shows the results for a keyword search on ‘blood pressure’:

Blood Pressure search results


Results are listed alphabetically by variable name.


Here, you have the option to add one or more of these variables to your basket.



Click on a variable to view more detailed metadata, including a frequency distribution and crosstab by sex. Many of the more recent datasets have extra documentation linked, including a data cleaning guide to that topic and references to published papers.


variable metadata
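
For readers unfamiliar with the terms, the sketch below shows what a frequency distribution and a crosstab by sex summarise. The responses and codes are invented purely for illustration and are not NSHD data; the function names are hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Count how many cases fall into each response value."""
    return dict(sorted(Counter(values).items()))


def crosstab(rows, cols):
    """Count cases for each combination of a row value and a column value."""
    table = {}
    for r, c in zip(rows, cols):
        table.setdefault(r, Counter())[c] += 1
    return {r: dict(counts) for r, counts in table.items()}


# Invented example values, not real study data.
responses = ["yes", "no", "yes", "yes", "no"]
sex = ["F", "M", "M", "F", "F"]

freq = frequency_table(responses)
by_sex = crosstab(responses, sex)
n_cases = sum(freq.values())  # total number of cases with a response
```

The total in `n_cases` is the kind of figure worth checking before adding a variable to your basket, since some variables cover only a sub-study rather than the full sample.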


1) Libraries are also known as library files, card numbers, or datasets; these terms may be used interchangeably.