Encoding issue when calling API via PowerShell

Recently we needed to fetch a large dataset from an API via PowerShell, then import it into Azure Data Explorer (ADX).

Problem

# Used Measure-Command for measuring performance
Measure-Command {curl 'THE_API_END_POINT' | select -expand Content > data.json}

The data.json file looked perfectly fine, but the import to ADX failed with the error “invalid json format”.

Troubleshooting

  1. Using an online validation tool such as https://jsonlint.com/ and pasting in the content of data.json: the JSON objects are valid.

  2. Using the local jsonlint tool, it reports an error, which shows that the data.json file has an encoding issue.

    PS C:\Users\lufeng\Desktop> jsonlint .\data.json
    Error: Parse error on line 1:
    ��[ { " _ i d " : {
    ^
    Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[', got 'undefined'
    at Object.parseError (C:\Users\lufeng\AppData\Roaming\npm\node_modules\jsonlint\lib\jsonlint.js:55:11)
    at Object.parse (C:\Users\lufeng\AppData\Roaming\npm\node_modules\jsonlint\lib\jsonlint.js:132:22)
    at parse (C:\Users\lufeng\AppData\Roaming\npm\node_modules\jsonlint\lib\cli.js:82:14)
    at main (C:\Users\lufeng\AppData\Roaming\npm\node_modules\jsonlint\lib\cli.js:135:14)
    at Object.<anonymous> (C:\Users\lufeng\AppData\Roaming\npm\node_modules\jsonlint\lib\cli.js:179:1)
    at Module._compile (internal/modules/cjs/loader.js:955:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:991:10)
    at Module.load (internal/modules/cjs/loader.js:811:32)
    at Function.Module._load (internal/modules/cjs/loader.js:723:14)
    at Function.Module.runMain (internal/modules/cjs/loader.js:1043:10)

Solution

Switching to a different PowerShell command solved the problem:

Invoke-WebRequest -Uri 'THE_API_END_POINT' -OutFile data.json
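The root cause: in Windows PowerShell, curl is an alias for Invoke-WebRequest, and the > redirection operator writes text as UTF-16LE with a byte-order mark (BOM), which is exactly the "��" shown at the start of the jsonlint error. Invoke-WebRequest with -OutFile writes the raw response bytes instead. A minimal Python sketch (the JSON content is made up for illustration) of the problem and the repair:

```python
# Sketch of the root cause: ">" in Windows PowerShell produces UTF-16LE
# with a BOM, which strict JSON tools such as jsonlint reject.
import json

text = '[{"_id": {"n": 1}}]'
raw = b"\xff\xfe" + text.encode("utf-16-le")  # what ">" put into data.json

# A tool expecting UTF-8 sees the BOM as the garbage bytes in the error.
assert raw.startswith(b"\xff\xfe")

# Decoding as UTF-16 and re-encoding as UTF-8 repairs the file.
fixed = raw.decode("utf-16").encode("utf-8")
print(json.loads(fixed))  # [{'_id': {'n': 1}}]
```

With a real file, the same repair is reading it in binary mode, decoding as UTF-16, and writing it back out as UTF-8.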


How to Decrypt Native App's HTTPS Traffic (and Debug for In-app Browser)

Problem with the in-app browsers of the LinkedIn and Facebook iOS apps

Recently our QA reported an interesting issue regarding native apps and our website: when the webpage was shared in the LinkedIn iOS app and/or the Facebook iOS app, the built-in browsers could not display it correctly, showing only a blank page.

  • This issue only happens on some of the iOS apps (see the list below).
  • Other iOS native apps have no problem.
  • Safari and Chrome for iOS have no problem.
  • All Android-based native apps have no problem.
  • All desktop browsers have no problem.

Read More


Jump-start Kubernetes and Istio with Docker Desktop on Windows 10

Here we will set up a single-node Kubernetes cluster on a Windows 10 PC (in my case a Surface 5 with 16 GB RAM). If you are new to Docker, feel free to check out Jump-start with Docker.
We are going to set up:

  • A single-node Kubernetes cluster
  • Kubernetes dashboard
  • Helm
  • Istio (service mesh, including Kiali)
  • Deployment samples

Read More


Customize social sharing on Linkedin via API

(edited 10.06.2020: Updated how to get User ID as LinkedIn upgraded their endpoints)

Problem:

Nowadays it is pretty common to share articles on social media such as Facebook and LinkedIn. Thanks to the widely implemented Open Graph protocol, sharing is no longer just a dry URL, but comes with rich text and thumbnails.

However, there are still some web pages that do not have Open Graph implemented, which significantly reduces readers’ willingness to click them.
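For reference, the Open Graph protocol is implemented with a handful of meta tags in the page head; a minimal set looks like this (all values are placeholders):

```html
<meta property="og:title"       content="Example article title" />
<meta property="og:description" content="Short summary shown in the preview card." />
<meta property="og:image"       content="https://example.com/thumbnail.png" />
<meta property="og:url"         content="https://example.com/article" />
```

Crawlers from LinkedIn, Facebook, etc. read these tags to build the preview card.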

In addition, even if you introduce the Open Graph tags as a hotfix, sometimes you will have to wait approximately 7 days for the LinkedIn crawler to refresh the preview cache, as mentioned in the LinkedIn documentation:

The first time that LinkedIn’s crawlers visit a webpage when asked to share content via a URL, the data it finds (Open Graph values or our own analysis) will be cached for a period of approximately 7 days.
This means that if you subsequently change the article’s description, upload a new image, fix a typo in the title, etc., you will not see the change represented during any subsequent attempts to share the page until the cache has expired and the crawler is forced to revisit the page to retrieve fresh content.

Some solutions are available here and here, but they are more like workarounds.

Solution:

We can overcome this issue by using the LinkedIn API, which provides great flexibility for customizing the sharing experience.

Read More


Data Integrity and Lineage by using DLT, Part 3

Other articles in this series:

Recap

In the second part of this series, we went through the detailed technical design, which is based on IOTA. A quick recap:

  1. Use MAM protocol for interacting with IOTA Tangle.
  2. Defined the core data schema (4 mandatory fields: “dataPackageId”, “wayOfProof”, “valueOfProof” and “inputs”).
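A minimal sketch of the core data schema recapped above, with SHA-256 as an illustrative way of proof (the helper function and field values are hypothetical, not the reference implementation):

```python
# Build a data package carrying the four mandatory schema fields.
import hashlib
import json

def make_package(package_id, payload, input_ids):
    return {
        "dataPackageId": package_id,                          # id of this data set
        "wayOfProof": "SHA256",                               # how the proof was computed
        "valueOfProof": hashlib.sha256(payload).hexdigest(),  # integrity hash of the data
        "inputs": list(input_ids),                            # upstream packages (lineage)
    }

pkg = make_package("pkg-001", b"sensor readings v1", ["pkg-000"])
print(json.dumps(pkg, indent=2))
```

Anyone holding the original payload can recompute the hash and compare it with valueOfProof stored on the Tangle, while the inputs list lets them walk the lineage upstream.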

Although the core data schema is quite easy to implement, companies and developers might face some challenges getting started, such as:

  1. Developers need to build up knowledge of IOTA and the MAM protocol.
  2. They need to build a user interface for data lineage visualization.
  3. Companies most likely need to set up and maintain dedicated infrastructure (a web server that runs the IOTA server code, a database, resources to perform Proof-of-Work, connections to neighbor nodes in the IOTA network, etc.), as public nodes from the community are not stable.

Data Lineage Service - an open source application to get you started

We want to address the above challenges and help everyone gain the benefits of data integrity and lineage. Therefore, we have built the “Data Lineage Service“ application. Developers and companies can apply this technology without a deep understanding of IOTA and the MAM protocol. It can be used either as a standalone application, or as a microservice that integrates with existing systems.

The key functions are:

  • Wraps the IOTA MAM protocol in the well-known HTTP(S) protocol as a standard RESTful API, with a Swagger definition. Developers do not need to worry about the MAM protocol and its technical details, and can focus on the normal data pipeline.
  • Reusable web interface for lineage visualization.
  • User-friendly interface for submitting data integrity and lineage information to DLT.
  • Built-in functionalities for addressing common issues such as caching and monitoring.
  • Open-sourced on GitHub under the MIT license.

Read More


Data Integrity and Lineage by using DLT, Part 2

Other articles in this series:

In my previous article, we discussed different approaches for solving the data integrity and lineage challenges, and concluded that the “Hashing with DLT“ solution is the direction we will move forward in. In this article, we will take a deep dive into it. Please note that Veracity’s work on data integrity and data lineage is testing many technologies in parallel. We utilise and test proven centralized technologies as well as new distributed ledger technologies like Tangle and Blockchain. This article series uses the IOTA Tangle as the distributed ledger technology. The use cases described can be solved with other technologies. This article does not necessarily reflect the technologies used in Veracity production environments.

Which DLT to select?

As Veracity is part of an Open Industry Ecosystem, we have focused our data integrity and data lineage work on public DLT and open-sourced technologies. We believe that to succeed in providing transparency from the user to the origin of data, many technology vendors must collaborate around common standards and technologies. The organizational setup and philosophies of some of the public distributed ledgers provide the right environment to learn and develop fast with an adaptive ecosystem.

Read More


Data Integrity and Lineage by using DLT, Part 1

Other articles in this series:

Introduction

With the proliferation of data – collecting and storing it, sharing it, mining it for gains – a basic question goes unanswered: is this data even good? The quality of data is of utmost concern, because you can’t do meaningful analysis on data you can’t trust. Here at Veracity, we are trying to address this very concern. This is a 3-part series, going all the way from concept to a working implementation using DLT (Distributed Ledger Technology).

Side note: Veracity is designed to help companies unlock, qualify, combine and prepare data for analytics and benchmarking. It helps data providers easily onboard data to the platform, and enables data consumers to access and mine value. The data can come from various sources, such as sensors and edge devices, production systems, historical databases and human input. Data is generated, transferred, processed and stored, from one system to another, one company to another.

Veracity is by DNV GL, which has held a strong brand as a trusted 3rd party for more than 150 years, yet it is still pretty common to hear questions from data consumers such as:

  1. Can I trust the data I got from Veracity?
  2. How was the data collected and processed?

Read More


Using new domain feng.lu

Shortly after I renewed my blog domain fenglu.me, it crossed my mind: “hey, is it possible to register a top-level domain with my family name, .lu? Then I could literally have my name as my site: feng.lu! That would be cool!”

(picture copyright: www.dreamhost.com)

And, (after googling), yes! It is possible! .lu is the Internet country code top-level domain for Luxembourg. OK… (continuing googling) “Can I register a .lu domain without being a Luxembourger?” “No problem!” Great!

Long story short, after some quick research on vendors and paying 24 euros, I got the brand-new feng.lu domain! :)

The remaining steps are pretty straightforward:

  • At the feng.lu domain provider, set up the apex domain and the www subdomain to point to my real blog host, GitHub Pages, according to their documentation.
  • In the GitHub Pages settings, update the custom domain (equivalent to updating the CNAME file).
  • Update the blog source code (Hexo) with the new domain.
  • Important! Since I would like all existing links from the old domain fenglu.me to keep working, I also set up domain forwarding. Remember to use “Redirect to a specific page/folder/subfolder”.
  • Update Google Analytics, GTM, etc.
  • Done!
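For illustration, the DNS records behind the first step typically look like this (the A-record IPs are GitHub Pages’ published addresses at the time of writing, and the username is a placeholder; check the GitHub Pages documentation for current values):

```
feng.lu.      A      185.199.108.153
feng.lu.      A      185.199.109.153
feng.lu.      A      185.199.110.153
feng.lu.      A      185.199.111.153
www.feng.lu.  CNAME  <username>.github.io.
```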

Happy blogging!


Data Integrity and Lineage by using IOTA

Edit log:

2018.09.25
This article is now expanded to an article series, where we have more detailed discussion and open-source code, check them out!

2018.08.26 - Updated the data schema:

  1. A unified format that covers both the lightweight format and the standard format, but is more flexible and self-explanatory.
  2. Specified mandatory and optional fields in the format. For example, Timestamp is now an optional field.

Introduction

If we say “Data is the new oil”, then data lineage is an issue that we must solve. Various data sets are generated (most likely by sensors), transferred, processed, aggregated, and flow from upstream to downstream.

The goal of data lineage is to track data over its entire lifecycle, to gain a better understanding of what happens to data as it moves through the course of its life. It increases trust in and acceptance of the results of data processing. It also helps to trace errors back to their root cause, and to comply with laws and regulations.

You can easily compare this with the traditional supply chain of raw materials in the manufacturing and/or logistics industries. However, compared to those traditional industries, data lineage faces new challenges.

Read More


Running IOTA Full Node

I have been looking at IOTA since last winter, as it seems promising for IoT, Machine-to-Machine Micro-payments and Data Market scenarios.

Installing an IOTA light wallet is pretty straightforward, but running a full node is not. Thanks to the great playbook, though, I managed to set up a Virtual Private Server to run as an IOTA full node:

  • 2 CPU cores
  • 4 GB memory
  • SSD
  • Hosted 24/7 in a data center in Western Europe

Read More
