What is the data about?
• The data consists of samples of weather data for eight locations over
two different time periods
• The five UK locations are:
o Leuchars: town in Scotland
o Leeming: village in North Yorkshire
o Heathrow: hamlet in Greater London
o Hurn: village in Dorset (South West England)
o Camborne: town in Cornwall (South West England)
• The three international locations are:
o Beijing: capital city of China
o Perth: capital city of Western Australia (state of Australia)
o Jacksonville: city in Florida (state of USA)
• The two time periods are:
o May to October 1987
o May to October 2015
What variables are included in the large data set?
• Daily mean (air) temperature
o Measured in degrees Celsius (°C) given to 1dp
o Average of hourly temperature readings over the 24 hours from 0900 to 0900 GMT
• Daily total rainfall
o Measured in millimetres (mm) given to 1dp
o Measured for the 24 hours starting at 0900 GMT
o A trace of rain 'tr' is an amount less than 0.05mm
• Daily total sunshine
o Measured in hours (hr) given to 1dp
• Daily maximum relative humidity
o Given as a percentage to the nearest integer
o A reading above 95% is associated with mist and fog
• Daily mean windspeed and direction
o Mean measured in knots (1 kn = 1.15 mph) given to the nearest integer
and also described using the Beaufort scale (calm, light, etc)
o Direction measured in degrees rounded to the nearest 10 and also given
as a cardinal direction (north, south, etc)
o Averaged for 24 hours starting at 0000 GMT
• Daily maximum gust and direction
o Measured using the same units as windspeed
o The maximum instantaneous speed over the 24 hours
• Cloud cover
o Measured in Oktas (eighths of the sky covered by cloud)
• Daily mean visibility
o Measured in decametres (1 Dm = 10 m) horizontally
• Daily mean pressure
o Measured in hectopascals (1 hPa = 100 Pa = 1 millibar)
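The unit conversions quoted above can be sketched in code. This is a minimal illustration only: the function names are hypothetical, the readings are made up, and 1.15 is the rounded conversion factor given in these notes.

```python
# Conversion factors quoted in the notes above.
MPH_PER_KNOT = 1.15          # 1 kn = 1.15 mph (rounded factor from the notes)
METRES_PER_DECAMETRE = 10    # 1 Dm = 10 m
PASCALS_PER_HECTOPASCAL = 100  # 1 hPa = 100 Pa = 1 millibar


def knots_to_mph(knots):
    """Convert a windspeed or gust from knots to miles per hour."""
    return knots * MPH_PER_KNOT


def decametres_to_metres(decametres):
    """Convert a visibility reading from decametres to metres."""
    return decametres * METRES_PER_DECAMETRE


def hectopascals_to_pascals(hectopascals):
    """Convert a pressure reading from hectopascals to pascals."""
    return hectopascals * PASCALS_PER_HECTOPASCAL


# Made-up readings, purely for illustration (not taken from the data set).
print(knots_to_mph(20))               # windspeed of 20 kn in mph
print(decametres_to_metres(2500))     # visibility of 2500 Dm in metres
print(hectopascals_to_pascals(1013))  # pressure of 1013 hPa in Pa
```

Converting to a familiar unit such as mph can make it easier to judge whether a value is sensible in context.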
Is the data complete?
• There are missing or unknown pieces of data
o These are listed as 'n/a' or '-'
o The daily total sunshine, daily mean windspeed and daily maximum gust are
unknown for the first half of May 1987 for the UK locations
o The data should be cleaned before samples are taken
• For the three international cities, data is only given for:
o Daily mean temperature, daily total rainfall, daily mean pressure and
daily mean windspeed
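Cleaning the data before sampling could look something like the sketch below. It assumes the readings have been copied out as text, treats 'n/a' and '-' as missing, and replaces a trace of rain 'tr' with 0.025 (0 is an equally acceptable choice). The function names and the sample values are made up for illustration.

```python
MISSING_MARKERS = {"n/a", "-"}  # how missing or unknown values are listed
TRACE_MARKER = "tr"             # a trace of rain, less than 0.05 mm
TRACE_VALUE = 0.025             # assumed stand-in for 'tr'; 0 is also acceptable


def clean_reading(raw, trace_value=TRACE_VALUE):
    """Return the reading as a float, or None if it is missing."""
    text = raw.strip().lower()
    if text in MISSING_MARKERS:
        return None
    if text == TRACE_MARKER:
        return trace_value
    return float(text)


def clean_column(raw_values, trace_value=TRACE_VALUE):
    """Clean every reading in a column and drop the missing ones."""
    cleaned = [clean_reading(value, trace_value) for value in raw_values]
    return [value for value in cleaned if value is not None]


# Made-up rainfall readings in mm, purely for illustration.
sample_rainfall = ["0.0", "tr", "n/a", "3.2", "-", "1.4"]
usable = clean_column(sample_rainfall)
print(usable)                     # the readings that can be sampled from
print(sum(usable) / len(usable))  # mean of the usable readings
```

Dropping the missing values before sampling means every item chosen for the sample is a usable number.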
What are some of the important features?
• Consider which locations are closer to the equator
• Consider which locations are near a coast
o Jacksonville, Perth, Camborne, Hurn and Leuchars are near the coast
• Consider which locations are in each hemisphere
o Perth is in the southern hemisphere so it has winter when the UK has
summer
• Consider which variables are discrete and which are continuous
o Cloud cover is discrete
• You can use 0 or 0.025 for rainfall that is listed as 'tr'
• The Great Storm of 1987 hit the UK on 15-16 October
o The wind speeds were high at this time
o The south and south-east of England were affected
o This will skew some variables (wind/gust/rainfall)
o This won't have much impact on some variables (sunshine/cloud cover)
▪ October in the UK is normally cloudy and has less sunshine
o You don't need to remember the exact dates, but you should be aware
of the storm
• Consider the number of days in each month
o 30 days in June and September
o 31 days in May, July, August and October
o In total the LDS covers 184 days in each time period
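The day count can be checked with a short calculation: four months of 31 days and two months of 30 days give 4 × 31 + 2 × 30 = 184. The sketch below does the same check using Python's standard calendar module. The function name is hypothetical.

```python
import calendar


def days_in_period(year, first_month=5, last_month=10):
    """Count the days from the start of first_month to the end of last_month."""
    return sum(
        calendar.monthrange(year, month)[1]  # second item is the days in the month
        for month in range(first_month, last_month + 1)
    )


# May to October in each of the two time periods.
for year in (1987, 2015):
    print(year, days_in_period(year))
```

Knowing the number of days in each month helps when choosing a sample, for example when picking every nth day from a particular month.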