1.3 Data Caution
While GFW data are a powerful tool for analysis, AIS data have several limitations that are important to consider. First, AIS has potential issues with reception quality, spoofing, and bad data. There are a number of situations where AIS messages may be broadcast but not recorded. To receive a message, a satellite must be overhead, and terrestrial receivers only record messages within line of sight (approximately 10 to 100 nautical miles). AIS devices also vary in signal strength, and where vessel density is high, signals can interfere with one another, preventing satellites from recording all vessel messages. There are two types of spoofing to be aware of: identity spoofing and location spoofing. Identity spoofing occurs when two or more vessels simultaneously broadcast the same vessel identification number, while location spoofing, or offsetting, refers to manipulation of a vessel’s AIS position to obscure its true location. Finally, bad data may show fishing vessel tracks in places that don’t make sense, such as on land.
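One simple way to screen a track for the spoofing and bad-data symptoms described above is to check the speed implied by consecutive positions. The sketch below is illustrative only and is not GFW's method: the function names, the tuple layout, and the 50-knot threshold are all assumptions chosen for the example. Two vessels sharing one identification number, or a position that jumps onto land, tend to show up as physically implausible jumps.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def flag_implausible_jumps(positions, max_speed_knots=50.0):
    """Return indices of positions reached at an implausible implied speed.

    positions: list of (timestamp_seconds, lat, lon) tuples sorted by time.
    max_speed_knots: hypothetical threshold; tune it for the fleet studied.
    """
    flagged = []
    for i in range(1, len(positions)):
        t0, lat0, lon0 = positions[i - 1]
        t1, lat1, lon1 = positions[i]
        dist = haversine_nm(lat0, lon0, lat1, lon1)
        hours = (t1 - t0) / 3600.0
        if hours <= 0:
            # No elapsed time: any movement at all is implausible.
            speed = float("inf") if dist > 0 else 0.0
        else:
            speed = dist / hours  # nautical miles per hour = knots
        if speed > max_speed_knots:
            flagged.append(i)
    return flagged
```

A flagged index is only a prompt to look closer, for example by plotting the track; it does not by itself establish that a vessel is spoofing.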
There are a number of ways to deal with these AIS problems. To verify reception quality, it may be helpful to check the quality of coverage for your research area using world-fishing-827.gfw_research.reception_quality_quarterdegree_vYYYYMMDD. Vessel info tables (such as world-fishing-827.gfw_research.vi_ssvid_byyear_vYYYYMMDD) can be filtered to exclude potential spoofing or offsetting vessels using the activity record. It is also common to keep only vessel segments that have more than 10 positions and are not flagged as overlapping and short. Examples of common noise filters for these types of issues can be found in the example queries on the GFW GitHub. The best way to deal with potentially bad data is to be diligent in checking, visualizing, and critically evaluating the data. GFW is constantly improving its pipeline and vessel lists with help from research partners. Any bad data or possible data issues can be reported to the GFW team through the emlab-gfw Slack channel. Posts should tag Tyler Clavelle on the GFW team.
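The segment filter described above can be sketched in a few lines. This is a minimal illustration on in-memory records, not a query against the GFW tables: the dictionary keys `seg_id`, `positions`, and `overlapping_and_short` are assumed names for the example, so check the actual table schema before adapting it.

```python
def is_good_segment(segment, min_positions=10):
    """True if a segment has more than min_positions positions
    and is not flagged as overlapping and short.

    The keys used here are assumptions made for this example.
    """
    return (segment["positions"] > min_positions
            and not segment["overlapping_and_short"])


def filter_segments(segments, min_positions=10):
    """Keep only the segments that pass the noise filter."""
    return [s for s in segments if is_good_segment(s, min_positions)]


# Hypothetical example records.
segments = [
    {"seg_id": "a", "positions": 250, "overlapping_and_short": False},
    {"seg_id": "b", "positions": 10, "overlapping_and_short": False},
    {"seg_id": "c", "positions": 40, "overlapping_and_short": True},
]
kept = [s["seg_id"] for s in filter_segments(segments)]
```

Here only segment "a" is kept: "b" has exactly 10 positions, which is not more than 10, and "c" is flagged as overlapping and short.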
It is also important to consider how AIS has changed since it was first adopted, especially when using AIS data for time series studies. AIS reception quality has improved and AIS use has grown in recent years, so an apparent increase in activity over time may partly reflect better coverage rather than a real change. Time series studies should account for this, either by restricting analyses to only those vessels broadcasting at the beginning of the study period or by restricting analyses to 2016 onwards.
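The two restrictions mentioned above can be expressed as simple filters. The sketch below assumes yearly activity records held as dictionaries with hypothetical `ssvid` and `year` keys; the function names are also invented for this example.

```python
def restrict_to_initial_cohort(records, start_year):
    """Keep records only for vessels that were broadcasting in start_year.

    This holds the set of vessels fixed, so later growth in AIS adoption
    does not add new vessels to the series.
    """
    cohort = {r["ssvid"] for r in records if r["year"] == start_year}
    return [r for r in records
            if r["ssvid"] in cohort and r["year"] >= start_year]


def restrict_to_recent_years(records, first_year=2016):
    """Keep only records from first_year onwards."""
    return [r for r in records if r["year"] >= first_year]


# Hypothetical example records.
records = [
    {"ssvid": "111", "year": 2013},
    {"ssvid": "111", "year": 2017},
    {"ssvid": "222", "year": 2015},
    {"ssvid": "222", "year": 2017},
]
cohort_records = restrict_to_initial_cohort(records, start_year=2013)
recent_records = restrict_to_recent_years(records)
```

The cohort approach keeps a longer time span but fewer vessels, while the 2016 cutoff keeps all vessels over a shorter span; which trade-off is appropriate depends on the research question.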
While GFW data are accurate, they rely on the accuracy and transparency of public fisheries data. It is important to critically analyze the data before using them and to understand what can and can’t be said. For example, best practice is to use terms like “apparent fishing effort” rather than “fishing effort,” and “encounters” and “loitering events” rather than “transshipment events.” Further, it’s important to recognize that AIS data cover only a fraction of the world’s fishing fleets. The Global Atlas of AIS-based Fishing Activity describes in more detail which fraction of the global fishing fleet is believed to be represented in the existing AIS data.