Awarded to Wallscope

Start date: Monday 25 October 2021
Value: £60,500
Company size: SME
The National Archives

Design, develop and deliver a new, cloud version of the PRONOM web service and file format registry

3 Incomplete applications

2 SME, 1 large

9 Completed applications

8 SME, 1 large

Important dates

Published
Wednesday 8 September 2021
Deadline for asking questions
Wednesday 15 September 2021 at 11:59pm GMT
Closing date for applications
Wednesday 22 September 2021 at 11:59pm GMT

Overview

Off-payroll (IR35) determination
Supply of resource: the off-payroll rules will apply to any workers engaged through a qualifying intermediary, such as their own limited company
Summary of the work
Design, develop, and deliver a new, cloud-based, graph (RDF) data version of the PRONOM web service and file format information registry, including editorial workflow functionality and improvements to the user interface.
Latest start date
Monday 4 October 2021
Expected contract length
An initial 2 months for Alpha delivery, with an option to extend for a further 4 months for Beta delivery
Location
No specific location, for example they can work remotely
Organisation the work is for
The National Archives
Budget range
Alpha: 2 months £70,000 (ex VAT)
Beta (optional): 4 months £140,000 (ex VAT)

About the work

Why the work is being done
The PRONOM registry (https://www.nationalarchives.gov.uk/PRONOM/Default.aspx) is an internationally renowned resource used for digital preservation planning by memory institutions around the world. However, the platform it currently runs on is outdated. This work aims to modernise the platform to ensure it is fit for the future.
We require a new, cloud-based service, backed by an RDF graph data model to ensure its data can be consumed in a variety of formats.
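As an illustration only of what "consumed in a variety of formats" could mean in practice, the sketch below holds one record as subject-predicate-object triples and renders it as N-Triples, JSON and XML using the Python standard library. The namespace, the identifier `fmt/0000`, the property names and the function names are all hypothetical and are not taken from PRONOM or from the Discovery prototype; a real service would more likely rely on an RDF library and an agreed vocabulary.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical namespace, identifier and property names, for illustration only.
BASE = "https://example.org/pronom/"

# One record held as subject-predicate-object triples with literal objects.
triples = [
    (BASE + "fmt/0000", BASE + "name", "Example Format"),
    (BASE + "fmt/0000", BASE + "version", "1.0"),
]


def local_name(uri):
    """Return the last path segment of a URI, e.g. 'name'."""
    return uri.rsplit("/", 1)[-1]


def to_ntriples(rows):
    """Serialise triples as N-Triples (simplified: escapes only \\ and ")."""
    lines = []
    for s, p, o in rows:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


def to_json(rows):
    """Group values by subject and property name, then emit JSON."""
    records = {}
    for s, p, o in rows:
        records.setdefault(s, {})[local_name(p)] = o
    return json.dumps(records, indent=2, sort_keys=True)


def to_xml(rows):
    """Emit one <record> element per subject, with a child per property."""
    records = {}
    for s, p, o in rows:
        records.setdefault(s, []).append((local_name(p), o))
    root = ET.Element("records")
    for s, props in records.items():
        rec = ET.SubElement(root, "record", {"id": s})
        for name, value in props:
            ET.SubElement(rec, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(to_json(triples))
    print(to_xml(triples))
```

The point of the sketch is that the triples are the single source of truth and each output format is a view derived from them, which is one way a graph-backed service might serve both human-facing pages and programmatic consumers.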
Problem to be solved
A new, secure, cloud-based web service is required to make PRONOM data available through appropriate mechanisms, including through API access, as well as in a form that the DROID file format identification utility can consume. The core web service will be made available via The National Archives’ web estate.
• Understand user need: review user research, engage with users to validate these findings, and create and prioritise user stories.
• Review and refine the prototype graph data model.
• Design and build a new web service, front-end and API for external users.
• Design and build a new back-end ‘editorial’ service to enable archivists and other privileged users to add and edit PRONOM data in a secure and intuitive manner.
• Create and iterate prototypes to deliver an initial Alpha prototype and test and refine this to deliver a working public Beta service.
• The service must meet NCSC standards for system security, GDS service standards, and WCAG 2.1 Accessibility standards.
• The service must be tested against the specified standards and evidence provided that the criteria have been met.
• The supplier should recommend a development environment/infrastructure as part of the response.
Who the users are and what they need to do
The PRONOM registry provides information about file formats that helps inform preservation planning for digital records. Internally, the service is primarily used by Digital Archivists working with record data. External users include archivists, librarians, academics, and others concerned with digital preservation planning and digital information management.
Additionally, the data held in PRONOM is consumed by the DROID file format identification utility and by other third-party identification utilities.
The supplier must test the new service appropriately to ensure that it meets user needs and expectations.
Early market engagement
We have reviewed technical approaches and engaged with specialists to identify preferred technical approaches, including RDF. The Alpha should make further recommendations on the technology approach.
Any work that’s already been done
A previous ‘Discovery’ project ran from December 2019 to March 2020. This produced prototype artifacts including a graph data model and back-end editorial services (https://github.com/digital-preservation/pronom2020).
It is anticipated that this new project will review and build upon this Discovery work.
Existing team
The supplier’s team will deliver the work. The National Archives staff will include a Product Manager with experience of the current system and a Delivery Manager. Additional support will be available from a Data Engineer and a Technical Architect, and from the core team of Digital Archivists who are the primary users of the service.
Current phase
Live

Work setup

Address where the work will take place
The National Archives, Kew, Richmond, Surrey TW9 4DU
Supplier to work remotely.
Working arrangements
The National Archives’ staff will be available during the UK working day (9am-5pm). The supplier will provide their own equipment and technology and will be given access to our organisational GitHub and Slack resources as appropriate. The supplier will work in accordance with Agile methodologies to scope, plan, and deliver the work incrementally, with regular active communication, and will conduct regular ‘show and tell’ sessions to demonstrate progress. Regular meetings will take place via Microsoft Teams, with Slack available for quick communication. There can be flexibility with communication methods if appropriate.
Security clearance
Baseline clearance will be required (BPSS)

Additional information

Additional terms and conditions

Skills and experience

Buyers will use the essential and nice-to-have skills and experience to help them evaluate suppliers’ technical competence.

Essential skills and experience
  • Demonstrable track record of delivering graph-based data web services for public consumption
  • Experience of delivering services that produce data in a variety of formats including XML, JSON and RDF
  • Experience in delivering web functionality that meets the international WCAG 2.1 AA accessibility standard
  • Experience of delivering highly secure services and understanding of government requirements for security, and meeting OWASP and CIS web application security best practices
  • Experience of working with an iterative, Agile approach to service delivery
  • Experience delivering to the Government Service Standard, including successfully passing service assessments
  • Experience of conducting user research and capturing usability requirements
  • Strong communication skills, including working collaboratively with our Subject Matter Experts to ensure the service meets the needs of its users
  • Ability to produce clear technical documentation
Nice-to-have skills and experience
  • An understanding of the needs and challenges of digital preservation and digital archiving
  • A familiarity with DROID and PRONOM and their use in support of preservation planning activities

How suppliers will be evaluated

All suppliers will be asked to provide a written proposal.

How many suppliers to evaluate
5
Proposal criteria
  • Evidence of delivering web services based on graph data technologies
  • Evidence of building robust AWS-hosted web services
  • Evidence of meeting accessibility requirements
  • Evidence of meeting the GDS Service standards
  • Evidence of delivering services that meet OWASP, CIS and NCSC security guidance
  • Team structure, including the relevance of the team members' skills and experience
Cultural fit criteria
  • Work in an open and transparent way, sharing work in progress and involving others as you go
  • Explain what methods you propose to use to engage, communicate, constructively challenge and work effectively with our team
  • Describe how you propose to support positive working relationships throughout the life of the contract
Payment approach
Capped time and materials
Additional assessment methods
  • Reference
  • Presentation
Evaluation weighting

Technical competence

60%

Cultural fit

20%

Price

20%

Questions asked by suppliers

1. Is there any incumbent?
No.
2. Are there any mockups or designs for the frontend application?
No, but the existing PRONOM service is available here: https://www.nationalarchives.gov.uk/PRONOM/Default.aspx
3. Can we assume that AWS is the pre-selected cloud provider?
While we anticipate that AWS will be the cloud platform for the new version of PRONOM, we will consider alternative proposals
4. Is there already an operational budget set that can help determine how infrastructure is chosen?
The operational budget has yet to be determined and will be influenced by proposals produced during the Alpha phase.
5. Are the requirements between Alpha and Beta phases clearly defined?
The requirements are broadly as described in this opportunity. Requirements for Beta will be influenced by outcomes of the Alpha.
6. Should DevOps Continuous Integration/Deployment be considered in the scope of work or is this seen as operational costs?
Any Beta delivery will be expected to follow CI/CD principles, although the choice of CI/CD technologies will be subject to agreement.
7. What does the geographic distribution of users look like?
While PRONOM has visitors from around the globe, the vast majority of hits originate from Western Europe, North America, and Oceania.
8. What user types can we distinguish?
Users are typically concerned with long term preservation of digital data, and primarily work in the heritage sector (galleries, libraries, archives, museums), information/records management (particularly within the public sector), or further/higher education settings. Additional users may be interested in digital forensics, or any field/industry with a regulatory requirement to maintain records over a long-term duration (e.g. aerospace, energy, finance, etc.). Non-human users will likely be data consumers from digital preservation vendor systems, and may additionally include systems or services maintained by owners or engineers of similar or related datasets pertaining to data/digital preservation.
9. “Privileged users” are mentioned. What does this mean?
Most visitors will be anonymous with no need for user management. The editorial functionality will be initially expected to be reserved to The National Archives employees, but this could potentially be extended to include a small number of external users.
10. Should the solution include user management as well?
Only for the editorial functionality.
11. Should it allow roles for users to be created, edited, deleted and set?
The editorial functionality may include roles, e.g. editor, approver/reviewer, admin, but this is open to be explored within the Alpha phase.
12. Are there user groups?
As above, only for the editorial functionality.
13. If there are users, how should they log in: Google account, Microsoft account, or just a login and password?
To be explored during the Alpha phase.
14. Is the “editorial service” a separate web application?
To be explored during the Alpha phase.
15. Will the front-end, back-end and editorial web services be hosted in an existing National Archives environment? Or does the supplier need to provide such hosting environments for the alpha and beta services?
It is anticipated that the Alpha phase and any requisite prototyping work could likely be conducted using offline technologies, whereas the Beta phase will require the set-up and administration of a cloud-based environment, with the required account(s) provided by The National Archives. However, it may be appropriate to deploy to cloud infrastructure earlier than Beta, in which case The National Archives will provide the requisite infrastructure.
16. Can you indicate the size of the user population who will be taking part in the user research? Could you give an idea of the types of user for this exercise; will they reflect the anticipated user base for the application (i.e. Digital Archivists working with record data, archivists, librarians, academics, and others concerned with digital preservation planning and digital information management)?
The anticipated user population for user research is small, and will likely primarily consist of digital archivists, librarians, academics, and information/records managers. It may also be appropriate to involve software vendors or others interested in consuming PRONOM’s data directly via an API/Linked Data endpoint.
17. Is there an existing set of use cases that we could use to define the set of API methods required for the different services?
This would be explored as part of the Alpha phase.
18. What is the current min/max/avg requests count per second/minute/hour?
Average no. of Page views - per minute 9, per hour 34
Max no. of Page views - per minute 110, per hour 496
Statistics are from the last 30 days
19. What is the expected min/max/avg requests count per second/minute/hour?
This is unlikely to increase significantly in the short term. It is anticipated that provision of a Linked Data API/Endpoint for PRONOM will result in an increase in the programmatic consumption of PRONOM data.
20. What is the current min/max/avg response time?
Avg. page load time - 1.97 seconds
Max page load time - 23.65 seconds
21. What is the expected max response time?
We don’t currently have specific targets for this.
22. What’s a min/max/avg size of the request/response?
This is not currently tracked, but as PRONOM is predominantly text-based, individual responses are likely to be mostly in the KB to low-MB range.
23. How many concurrent users should the website handle?
Currently, PRONOM rarely has more than 100 users in a day and this is unlikely to change in the short term.
24. What is the current size of the database (RDF data files) or number of records?
Pre-migration to RDF, PRONOM as a relational MS SQL database is relatively small – a backup data dump is approximately 25MB.
25. What is the minimum expected SLA?
The PRONOM website doesn’t currently have a formal uptime target or guarantee, although users normally expect it to be available 24/7. There is no current expectation for out-of-hours support.