Millersoft Ltd

Data Nessie for AWS Data Lake Automation

Data Nessie is a lightweight migration service that moves data from popular databases into AWS S3 for analysis. Selected source tables are automatically copied into S3 in Parquet format and surfaced in AWS Athena via the AWS Glue Data Catalog. Change data capture (CDC) records each table's history in S3.

Features

  • No complex ETL to write
  • No legacy code to maintain
  • No training required
  • No operational system changes
  • No database changes
  • No new servers or licences required
  • No need for invasive database triggers

Benefits

  • Faster Data Lake development
  • Reduced Cost of Ownership
  • Flexible; reconfigure the Data Lake in minutes
  • Zero technical debt to accumulate
  • Full service support from our Data Lake and AWS experts
  • Extensible; populate your Data Lake from all operational systems
  • First-class AWS citizen, fully integrated with Glue, Athena and S3
  • GDPR compliant via automated tokenisation of PII
  • Automates the generation of operational spot positions
  • Automated reconciliation from source to Data Lake
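
The listing does not describe how reconciliation is performed. As a minimal sketch only, assuming reconciliation means comparing a row count and an order-independent checksum between the source table and its Data Lake copy, the idea could look like this in Python. The function names `table_fingerprint` and `reconcile` are hypothetical and are not part of Data Nessie.

```python
import hashlib

MODULUS = 2 ** 256  # keeps the combined checksum at a fixed size


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    Hypothetical helper: per-row SHA-256 digests are summed modulo 2**256,
    so the result does not depend on the order in which rows arrive.
    """
    count = 0
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        combined = (combined + int.from_bytes(digest, "big")) % MODULUS
        count += 1
    return count, combined


def reconcile(source_rows, lake_rows):
    """True when both sides have the same row count and checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(lake_rows)


# Illustrative data only: the same rows in a different order reconcile,
# while a single changed value does not.
source = [(1, "a"), (2, "b")]
lake_ok = [(2, "b"), (1, "a")]
lake_bad = [(1, "a"), (2, "B")]

print(reconcile(source, lake_ok))
print(reconcile(source, lake_bad))
```

A count plus checksum comparison of this kind detects missing, extra or altered rows without transferring full row contents for a field-by-field comparison.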

Pricing

£183 to £5,111 a server a month

  • Free trial available

Service documents

Request an accessible format
If you use assistive technology (such as a screen reader) and need versions of these documents in a more accessible format, email the supplier at gerry@millersoftltd.com. Tell them what format you need. It will help if you say what assistive technology you use.

Framework

G-Cloud 12

Service ID

539829765562116

Contact

Millersoft Ltd Gerry Conaghan
Telephone: 0131 376 7114
Email: gerry@millersoftltd.com

Service scope

Software add-on or extension
No
Cloud deployment model
Public cloud
Service constraints
Data Nessie polls individual tables for changes, so each table copied into the data lake must have a column that indicates which rows have changed. This Change Data Capture (CDC) column can be a timestamp, a sequence or, in the case of MS SQL Server, a rowversion.

Polling-based CDC can miss intermediate changes if a record changes more than once between polls; only the latest state is captured. However, because Data Nessie places little load on the database server, polls can be frequent.
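
To illustrate the constraint, here is a minimal sketch of watermark-based polling in Python, using an in-memory SQLite database as a stand-in for the source system. The table `accounts`, the CDC column `version` and the function `poll_changes` are assumptions made for the example; this is not Data Nessie's implementation.

```python
import sqlite3


def poll_changes(conn, table, cdc_column, last_seen):
    """Return rows whose CDC column exceeds the stored watermark.

    Table and column names are treated as trusted configuration here,
    not as user input.
    """
    query = f"SELECT * FROM {table} WHERE {cdc_column} > ? ORDER BY {cdc_column}"
    rows = conn.execute(query, (last_seen,)).fetchall()
    # Advance the watermark to the highest CDC value seen in this poll.
    new_watermark = rows[-1][cdc_column] if rows else last_seen
    return rows, new_watermark


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 1)")

history = []
watermark = 0

# First poll picks up the initial row.
rows, watermark = poll_changes(conn, "accounts", "version", watermark)
history.extend(dict(r) for r in rows)

# Two updates to the same record happen before the next poll.
conn.execute("UPDATE accounts SET balance = 150, version = 2 WHERE id = 1")
conn.execute("UPDATE accounts SET balance = 175, version = 3 WHERE id = 1")

# Second poll sees only the latest state; the balance of 150 is never recorded.
rows, watermark = poll_changes(conn, "accounts", "version", watermark)
history.extend(dict(r) for r in rows)

print([h["balance"] for h in history])  # [100, 175]
```

Shortening the poll interval narrows the window in which intermediate states can be lost, which is why a low-impact poll matters.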
System requirements
  • Migrates on-premises databases to AWS S3 via SSH tunnel
  • Migrates AWS RDS databases to AWS S3 directly

User support

Email or online ticketing support
Email or online ticketing
Support response times
Depends on SLA, normally within 4 hours
User can manage status and priority of support tickets
Yes
Online ticketing support accessibility
None or don’t know
Phone support
Yes
Phone support availability
9 to 5 (UK time), Monday to Friday
Web chat support
No
Onsite support
Yes, at extra cost
Support levels
L1: Tier/Level 1 (T1/L1)
Initial support level responsible for basic customer issues: gathering information to determine the issue by analysing the symptoms and identifying the underlying problem.
L2: Tier/Level 2 (T2/L2)
A more in-depth technical support level than Tier 1, staffed by more experienced and knowledgeable personnel on a particular product or service.
L3: Tier/Level 3 (T3/L3)
Experts in their fields who are responsible not only for assisting Tier 1 and Tier 2 personnel, but also for the research and development of solutions to new or unknown issues.
Severity Definitions
1 - Critical: Proven Error of the Product in a production environment. The Product Software is unusable, resulting in a critical impact on the operation. No workaround is available.
2 - Serious: The Product will operate but due to an Error, its operation is severely restricted. No workaround is available.
3 - Moderate: The Product will operate with limitations due to an Error that is not critical to the overall operation. For example, a workaround forces a user and/or a systems operator to use a time-consuming procedure to operate the system; or removes a non-essential feature.
4 - Due to an Error, the Product can be used with only slight inconvenience.
Support available to third parties
Yes

Onboarding and offboarding

Getting started
Documentation and training videos: https://datanessie.com/documentation/
Service documentation
Yes
Documentation formats
  • HTML
  • PDF
End-of-contract data extraction
All data resides inside the customer's AWS account.
No data or passwords are retained on the Data Nessie server.
End-of-contract process
No termination charge, pay per use model.

Using the service

Web browser interface
Yes
Supported browsers
  • Internet Explorer 10
  • Internet Explorer 11
  • Microsoft Edge
  • Firefox
  • Chrome
  • Safari 9+
  • Opera
Application to install
No
Designed for use on mobile devices
No
Service interface
No
API
No
Customisation available
Yes
Description of customisation
We can accommodate and support custom configuration requests.
The PII hashing algorithm can be configured by the end user.
Users select which databases and tables are copied into the data lake.
Users select columns for tokenisation.
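
As a rough illustration of column-level tokenisation with a configurable hashing algorithm, the sketch below replaces selected columns with a keyed hash in Python. The use of HMAC, the secret key, and the names `tokenise` and `tokenise_row` are assumptions made for the example; the listing does not state how Data Nessie generates its tokens.

```python
import hashlib
import hmac


def tokenise(value, secret_key, algorithm="sha256"):
    """Replace a value with a deterministic keyed-hash token (hex string).

    `algorithm` is any hash name supported by hashlib, which models the
    user-configurable hashing algorithm.
    """
    return hmac.new(secret_key, str(value).encode("utf-8"), algorithm).hexdigest()


def tokenise_row(row, pii_columns, secret_key, algorithm="sha256"):
    """Return a copy of the row with only the selected columns tokenised."""
    return {
        col: tokenise(val, secret_key, algorithm)
        if col in pii_columns and val is not None
        else val
        for col, val in row.items()
    }


# Illustrative values only; a real key would come from a secrets store.
key = b"demo-key-not-for-production"
row = {"id": 7, "email": "jane@example.com", "city": "Edinburgh"}

masked = tokenise_row(row, {"email"}, key)
print(masked["id"], masked["city"])  # non-PII columns pass through unchanged
print(masked["email"])               # 64-character hex token for SHA-256
```

Because the token is deterministic for a given key and algorithm, the same source value always maps to the same token, so tokenised columns can still be joined and grouped in queries.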

Scaling

Independence of resources
Supports AWS Athena for serverless query access via SQL at scale.

Analytics

Service usage metrics
Yes
Metrics types
Logs and status are uploaded to AWS CloudWatch for analysis
Integrates with common dashboard tools
Reporting types
  • Real-time dashboards
  • Reports on request

Resellers

Supplier type
Not a reseller

Staff security

Staff security clearance
Other security clearance
Government security clearance
Up to Developed Vetting (DV)

Asset protection

Knowledge of data storage and processing locations
Yes
Data storage and processing locations
  • United Kingdom
  • European Economic Area (EEA)
  • EU-US Privacy Shield agreement locations
User control over data storage and processing locations
Yes
Datacentre security standards
Complies with a recognised standard (for example CSA CCM version 3.0)
Penetration testing frequency
Never
Protecting data at rest
  • Physical access control, complying with CSA CCM v3.0
  • Physical access control, complying with SSAE-16 / ISAE 3402
  • Encryption of all physical media
  • Scale, obfuscating techniques, or data storage sharding
Data sanitisation process
Yes
Data sanitisation type
Deleted data can’t be directly accessed
Equipment disposal approach
Complying with a recognised standard, for example CSA CCM v3.0, CAS (Sanitisation) or ISO/IEC 27001

Data importing and exporting

Data export approach
All user data resides in AWS S3
All user meta data is backed up to AWS S3
Data export formats
Other
Other data export formats
All operational data is stored in S3 in parquet format
Data import formats
  • CSV
  • Other
Other data import formats
JDBC Data Source

Data-in-transit protection

Data protection between buyer and supplier networks
  • Legacy SSL and TLS (under version 1.2)
  • Other
Other protection between networks
Can also encrypt prior to transfer
Data protection within supplier network
TLS (version 1.2 or above)

Availability and resilience

Guaranteed availability
Customer dependent.
Approach to resilience
AWS services are delivered from multiple datacentres worldwide. When deploying customer services to AWS, Data Nessie can be configured so that services span multiple availability zones (data centres) to ensure service resilience. Alternatively, our Disaster Recovery as a Service offering can be used to provide DR.
Outage reporting
AWS CloudWatch alerts can be created

Identity and authentication

User authentication needed
Yes
User authentication
Username or password
Access restrictions in management interfaces and support channels
Access to management interfaces and support channels is restricted through a combination of usernames and passwords, multifactor authentication, firewalling, IP restrictions and the use of bastion hosts, as appropriate.
Access restriction testing frequency
At least once a year
Management access authentication
  • 2-factor authentication
  • Public key authentication (including by TLS client certificate)
  • Username or password

Audit information for users

Access to user activity audit information
Users have access to real-time audit information
How long user audit data is stored for
User-defined
Access to supplier activity audit information
Users have access to real-time audit information
How long supplier audit data is stored for
User-defined
How long system logs are stored for
User-defined

Standards and certifications

ISO/IEC 27001 certification
No
ISO 28000:2007 certification
No
CSA STAR certification
No
PCI certification
No
Other security certifications
No

Security governance

Named board-level person responsible for service security
Yes
Security governance certified
No
Security governance approach
Our AWS Marketplace service is delivered via AWS CloudFormation and has been reviewed by AWS to ensure best practice is followed.
Information security policies and processes
Data Nessie follows AWS best practice on security https://aws.amazon.com/security/

Operational security

Configuration and change management standard
Supplier-defined controls
Configuration and change management approach
All code is under version control using Git
Jenkins is used to build releases
An automated test framework is used for integration testing
Changes are tracked via Jira
CloudFormation is used to deploy via AWS Marketplace
Vulnerability management type
Undisclosed
Vulnerability management approach
The solution is deployed into the customer's AWS VPC via AWS CloudFormation.
External access is configured by the customer, and the GUI is locked down via AWS security groups.
SSH access is also locked down via security group and PEM file.
Access is as secure as the customer's network.
Patches are delivered as new AWS AMIs.
Protective monitoring type
Supplier-defined controls
Protective monitoring approach
All logs go to AWS CloudWatch for auditing, monitoring and alerting
Incident management type
Supplier-defined controls
Incident management approach
Each Data Nessie instance runs within a VPC within the customer's AWS account. There is no external access or monitoring. Issues need to be reported to the supplier and logs supplied for external analysis.

Secure development

Approach to secure software development best practice
Supplier-defined process

Public sector networks

Connection to public sector networks
No

Pricing

Price
£183 to £5,111 a server a month
Discount for educational organisations
No
Free trial available
Yes
Description of free trial
Full access for 30 days
