Port State Control (PSC) reports come in all shapes and sizes, which makes extracting data from them a laborious, manual, and error-prone process. Many are handwritten, and their layout varies with the inspection regime followed at the port the vessel is visiting. Because a report records the vessel's compliance status, deficiencies, and general condition, it matters a great deal to the vessel owner or operating company: it directly affects their ability to do business, and the worst outcome of a PSC inspection is detention of the vessel. Owners and companies therefore take great pains to monitor these reports and extract actionable insights from them. The biggest challenge they face is that the reports are not standardized. Most arrive as scanned PDF documents; some are handwritten or smudged, some contain tables, and others record findings as 'x' marks in checkboxes.

Given this complexity, we found that LlamaParse performed outstandingly well at parsing these reports, so it is the workhorse of the solution we have developed. The user uploads a PDF through the provided interface, and in the backend LlamaParse converts the document to markdown. The markdown text is then inserted into an extraction prompt for the Llama 3.2 text model, which returns structured data in JSON format. The extraction relies on prompt engineering combined with Pydantic classes and data validators. Once extracted, the data is displayed to the user and also written to a Supabase database.

Since the data is now persistent as well as consistent, it will be used in the future to build a data-driven analytical application for the user. That application would surface insights such as the type of deficiency reported most often for a particular class of vessel, or how deficiencies vary by inspector and by the port where the inspection takes place.
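To make the validation step more concrete, here is a minimal sketch of how raw model output might be turned into a checked record. It is not the project's actual code: to stay dependency-free it uses standard-library dataclasses as a stand-in for the Pydantic classes mentioned above, and every field name (`vessel_name`, `imo_number`, `detained`, and so on), the `parse_llm_output` helper, and the sample values are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Deficiency:
    # Hypothetical fields; a real PSC schema would likely carry more detail.
    code: str
    description: str

    def __post_init__(self):
        self.code = str(self.code).strip()
        self.description = str(self.description).strip()
        if not self.code:
            raise ValueError("deficiency code must not be empty")


@dataclass
class PSCReport:
    # Hypothetical schema, assumed for this sketch only.
    vessel_name: str
    imo_number: str
    port: str
    inspection_date: date
    detained: bool
    deficiencies: list = field(default_factory=list)

    def __post_init__(self):
        # IMO ship identification numbers have seven digits.
        self.imo_number = str(self.imo_number).strip()
        if not (self.imo_number.isdigit() and len(self.imo_number) == 7):
            raise ValueError(f"invalid IMO number: {self.imo_number!r}")
        # Accept ISO dates as strings and normalize them to date objects.
        if isinstance(self.inspection_date, str):
            self.inspection_date = date.fromisoformat(self.inspection_date)
        if not isinstance(self.detained, bool):
            raise ValueError("detained must be true or false")
        # Nested dictionaries become validated Deficiency objects.
        self.deficiencies = [
            d if isinstance(d, Deficiency) else Deficiency(**d)
            for d in self.deficiencies
        ]


def parse_llm_output(raw: str) -> PSCReport:
    """Pull the JSON object out of a model reply and validate it."""
    # Models often wrap JSON in prose or code fences, so keep only the
    # text between the first "{" and the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start : end + 1])
    # Unknown or missing keys raise TypeError here.
    return PSCReport(**data)


# Illustrative model reply; the vessel and values are made up.
sample_reply = """Here is the extracted data:
{"vessel_name": "EXAMPLE STAR", "imo_number": "9074729",
 "port": "Rotterdam", "inspection_date": "2024-03-14", "detained": false,
 "deficiencies": [{"code": "07105", "description": "Fire doors not closing"}]}"""

report = parse_llm_output(sample_reply)
print(report.vessel_name, report.inspection_date, len(report.deficiencies))
```

Rejecting a malformed record at this point, before it is displayed or stored, is what keeps the database rows consistent. With Pydantic the same checks would normally live in field validators on a model class.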
"This seems like not only a great application for the problem at hand, but also something that could be applied to many government agencies with a similar challenge."
Katie Jordan
"Great flow and implementation"
Yahia Bsat