Data Integrity and Lineage by using DLT, Part 2

(Also published at Veracity Blog)

Other articles in this series:

  • Part 1
  • Part 2 (this article)
  • Part 3

In my previous article, we discussed different approaches for solving the data integrity and lineage challenges, and concluded that the “Hashing with DLT” solution is the direction we will pursue. In this article, we will take a deep dive into it. Please note that Veracity’s work on data integrity and data lineage is testing many technologies in parallel. We utilise and test proven centralized technologies as well as new distributed ledger technologies like Tangle and Blockchain. This article series uses the IOTA Tangle as the distributed ledger technology. The use cases described can also be solved with other technologies. This article does not necessarily reflect the technologies used in Veracity production environments.
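
To make the hashing idea mentioned above a little more concrete, here is a minimal sketch: it fingerprints a data payload and later checks the payload against the recorded fingerprint. The choice of SHA-256, the function names, the example data ID and the in-memory dictionary standing in for the ledger are all assumptions made for illustration. This is not Veracity's implementation, and it does not talk to the IOTA Tangle.

```python
import hashlib

# Hypothetical stand-in for a distributed ledger: maps a data ID to its hash.
# A real ledger would be append-only and shared between parties.
ledger = {}


def fingerprint(payload: bytes) -> str:
    """Return the SHA-256 hex digest of the payload (assumed hash function)."""
    return hashlib.sha256(payload).hexdigest()


def publish(data_id: str, payload: bytes) -> str:
    """Record the payload's fingerprint under data_id and return it."""
    digest = fingerprint(payload)
    ledger[data_id] = digest
    return digest


def verify(data_id: str, payload: bytes) -> bool:
    """Check that the payload still matches the fingerprint recorded earlier."""
    return ledger.get(data_id) == fingerprint(payload)


original = b'{"temperature": 21.5}'
tampered = b'{"temperature": 25.1}'

publish("sensor-42/reading-001", original)
print(verify("sensor-42/reading-001", original))  # True
print(verify("sensor-42/reading-001", tampered))  # False
```

Only the fingerprint is recorded, not the data itself, so a consumer who receives the data through any channel can recompute the hash and compare it with the recorded one.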

Which DLT to select?

As Veracity is part of an Open Industry Ecosystem, we have focused our data integrity and data lineage work on public DLT and open-source technologies. We believe that to succeed in providing transparency from the user to the origin of data, many technology vendors must collaborate around common standards and technologies. The organizational setup and philosophies of some of the public distributed ledgers provide the right environment to learn and develop fast with an adaptive ecosystem.


Data Integrity and Lineage by using DLT, Part 1

(Also published at Veracity Blog)

Other articles in this series:

  • Part 1 (this article)
  • Part 2
  • Part 3

Introduction

With the proliferation of data – collecting and storing it, sharing it, mining it for gains – a basic question goes unanswered: is this data even good? The quality of data is of utmost concern because you can’t do meaningful analysis on data which you can’t trust. Here at Veracity, we are trying to address this very concern. This is a three-part series, going all the way from concept to a working implementation using DLT (Distributed Ledger Technology).

As a side note, Veracity is designed to help companies unlock, qualify, combine and prepare data for analytics and benchmarking. It helps data providers easily onboard data to the platform, and enables data consumers to access it and mine value from it. The data can come from various sources, such as sensors and edge devices, production systems, historical databases and human inputs. Data is generated, transferred, processed and stored, moving from one system to another and from one company to another.

Veracity is by DNV GL, which has held a strong brand as a trusted third party for more than 150 years, yet it is still pretty common to hear questions from data consumers such as:

  1. Can I trust the data I got from Veracity?
  2. How was the data collected and processed?


Using new domain feng.lu

Shortly after I renewed my blog domain fenglu.me, it crossed my mind: “Hey, is it possible to register a domain under a top-level domain that matches my family name, .lu? Then I could literally have my name as my site address: feng.lu! That would be cool!”

(picture copyright: www.dreamhost.com)

And (after googling), yes! It is possible! .lu is the Internet country code top-level domain for Luxembourg. OK… (continue googling) “Can I register a .lu domain without being a Luxembourger?” “No problem!” Great!

Long story short, after some quick research on vendors and paying 24 euros, I got the brand new feng.lu domain! :)

The rest is pretty straightforward:

  • At the feng.lu domain provider, set up an apex domain and a www subdomain pointing to my actual blog host, GitHub Pages, according to their documentation.
  • In the GitHub Pages settings, update the custom domain (equivalent to updating the CNAME file).
  • Update the blog source code (Hexo) with the new domain.
  • Important! Since I would like all existing links from the old domain fenglu.me to keep working, I also set up domain forwarding (see the documentation). Remember to use “Redirect to a specific page/folder/subfolder”.
  • Update Google Analytics, GTM, etc
  • Done!

Happy blogging!


Data Integrity and Lineage by using IOTA

Edit log:

2018.08.26 - Updated the data schema:

  1. A unified format now covers both the lightweight format and the standard format, and is more flexible and self-explanatory.
  2. Mandatory and optional fields are now specified in the format. For example, Timestamp is now an optional field.

Introduction

If we say “Data is the new oil”, then data lineage is an issue that we must solve. Various data sets are generated (most likely by sensors), transferred, processed and aggregated, and flow from upstream to downstream.

The goal of data lineage is to track data over its entire lifecycle, to gain a better understanding of what happens to data as it moves through the course of its life. It increases trust in and acceptance of the results of data processing. It also helps to trace errors back to the root cause, and to comply with laws and regulations.

You can easily compare this with the traditional supply chain of raw materials in the manufacturing and/or logistics industries. However, compared to those traditional industries, data lineage faces new challenges.


Running IOTA Full Node

I have been looking at IOTA since last winter, as it seems promising for IoT, Machine-to-Machine Micro-payments and Data Market scenarios.

Installing an IOTA light wallet is pretty straightforward, but running a full node is not. Thanks to the great playbook, though, I managed to set up a Virtual Private Server to run as an IOTA full node:

  • 2-core CPU
  • 4 GB memory
  • SSD
  • Hosted 24/7 in a data center in Western Europe


Infrastructure-as-Code and CI/CD in the real world, with VSTS and Azure (Part 1)

Hello again!

It has been a while since my last post. That is because I was quite busy leading a team in a program for delivering veracity.com, the open industry data platform from DNV GL. It is a pretty exciting project - to build an open, independent data platform with bleeding-edge technologies, to serve a large user base (100 000 registered users). You can read more about Veracity here and here.

There is actually a long and interesting story behind Veracity (and its predecessor), along with all the challenges that we encountered on this journey. Hopefully I can share them with you in the future.

Anyway, today I would like to talk about what Infrastructure-as-Code looks like in the real world, together with Azure and VSTS.


OAuth in Azure AD B2C with Nodejs

Recently we needed to build a Nodejs single-page-application (SPA) solution that uses Azure AD B2C as the identity provider (IdP). Since it is a single-page application, we are going to use the OAuth2 Implicit Flow.

This article demonstrates the basic steps for setting up both the server side (WebAPI) and the client application.


Using python to organize pictures

Problem

Having several digital cameras is fun: you can have different photography experiences.

However, organizing pictures is far less interesting, especially if you do not have a consistent process (like a naming convention) for archiving. After several years, I ended up with a hundred thousand pictures sitting in huge, messy folders:

  • Nikon_Pictures
  • Backup_SDCard01
  • 100_0302
  • DCIM_From_Old_Phone
  • 100CANON
  • Backup-Photo
  • etc…

The trickiest part is that I have so many duplicate pictures everywhere due to inconsistent archiving over the years. It is so messy that I never dared to clean them up manually.

Naturally, the knowledge of programming came to my rescue. This time, it is Python.
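
To give a flavour of how Python can help with the duplicate problem described above, here is a minimal sketch that groups files by the hash of their content, so identical pictures are found even when their names and folders differ. The function names and the choice of SHA-256 are assumptions made for illustration, not necessarily how the script in this post works.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's content in chunks so large pictures are not loaded at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> List[List[Path]]:
    """Return groups of files under root that have identical content."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Only groups with more than one file are duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("Nikon_Pictures"))`, for example, would return one list per set of identical files. Nothing is deleted, so the result can be reviewed before any cleanup.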


Tracking subdomains with Google Tag Manager

Recently I have been investigating how to track user behavior across our digital services.

We have web applications like:

  1. example.com (the company homepage)
  2. service-A.example.com (digital service A)
  3. service-B.example.com (digital service B)

and we are using Google Tag Manager (and Google Analytics).


Setup VIM plugin

Time to revisit my VIM plugin system now that VIM has been upgraded to version 8.0.

Previously I was using Vundle, but it is a bit complicated to set up quickly. This time I am using vim-plug.
