Return from Sabbatical

Alan Viars, founder and president of Videntity, has completed his appointment as an HHS External Entrepreneur at the Centers for Medicare and Medicaid Services (CMS) and has returned to Videntity full time.


President’s Sabbatical

Alan Viars, founder and president of Videntity, is on sabbatical from Videntity until Spring 2015 during his appointment as an HHS External Entrepreneur at the Centers for Medicare and Medicaid Services (CMS). He is working on an open-source redesign and modernization of the National Plan and Provider Enumeration System (NPPES). NPPES is best known as the system that issues National Provider Identifiers (NPIs). More information about this effort can be found at


Certificate Authority Management System for Direct

The management software that runs the certificate authority (CA) at, or more precisely, is now open source. “vcert” is a web-based application written in Django which relies on OpenSSL for certificate creation and management. is still free to use, but if you need to operate your own CA or a registration authority (RA), then this system provides a simple web-based interface for doing so. This software was originally designed to facilitate testing of the Direct Project.

The source code can be found here on GitHub. See the README for more information on installation and operation.


Converting Stata files to CSV using R

A simple recipe for converting Stata data into CSV. CSV stands for comma-separated values. A common use for this recipe is moving information from Stata into a spreadsheet or another database. In this example, we assume the Stata input file is named “MyStata.dta” and the resulting CSV file is named “MyStata.csv”.
Here is the process.

Install R (if it’s not already installed)

$ sudo apt-get install r-base-core

Now run R from the folder where your Stata (.dta) file lives.

  $ R

You should see something like this:

R version 2.14.1 (2011-12-22)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


Import the foreign library

> library(foreign)

Import the Stata data

> MyData <- read.dta("MyStata.dta")

Now write the information out to a CSV file using write.csv()

> write.csv(MyData, file = "MyStata.csv")

Quit R

> q()

Now you have your CSV file.
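As a quick sanity check after the conversion, the resulting file can be inspected outside of R. This is only a minimal sketch in Python and is not part of the original recipe: the helper name summarize_csv is hypothetical, and the sample data is made up. It assumes the file was written by write.csv() with its default settings, which put R's row names in an unnamed first column.

```python
import csv
import io


def summarize_csv(text):
    """Return (column_names, row_count) for CSV text.

    Assumes the first line is a header row, as write.csv() produces.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    row_count = sum(1 for _ in reader)
    return header, row_count


if __name__ == "__main__":
    # Made-up sample in the shape write.csv() emits by default:
    # the first header cell is empty because it labels R's row names.
    sample = '"","id","score"\n"1",101,3.5\n"2",102,4.25\n'
    header, rows = summarize_csv(sample)
    print(header)
    print(rows)
```

To check the real output, pass it the contents of "MyStata.csv" instead of the sample string. If the extra row-name column is unwanted, write.csv() accepts row.names = FALSE.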


Provider Directory Proposal (NPPES Redux)

A couple of weeks ago, I attended ONC’s Direct Bootcamp in Crystal City, VA. A hot topic at the two-day conference was the notion of a “Provider Directory” that incorporates Direct email addresses.

I also read that HHS/CMS intends to revamp the National Plan and Provider Enumeration System (NPPES). This is the system that manages National Provider Identifiers (NPIs). Every individual provider and provider organization has one of these numbers, sort of like a tax ID for providers. A common complaint I hear is that it contains information that is often out of date and/or incorrect.

So what, you might ask, does the NPPES have to do with the Direct Project? Having worked with the NPPES data and having some background with Direct, the idea of “killing two birds with one stone” has captured my imagination. (Nerdy and wonky, I know.) This is an opportunity for government efficiency by consolidating systems. Efficiency can only be achieved if the new system is simple, however. Too often in health information technology, consultants and vendors introduce complexity for complexity’s sake. After all, complexity is good for the bottom line for many companies because it means more billable hours and more services sold. Sadly, I see this sort of thing all the time. As an American and a taxpayer it ticks me off. (See footnote)

To show what I mean by “simple”, I’ve built a prototype web service application that illustrates my vision of a combined NPPES and Direct email Provider Directory. Before I outline that technical proposal, however, I’d like to point out how adding some other data fields to NPPES could result in an empowering service for patients, providers, and payers.

Adding Other Data Points to NPPES

A full NPPES would contain more fields and would require pagination, neither of which is illustrated here. The items that I think would be most useful to add include:

1. Diagnosis and Procedure Codes such as ICD9, ICD10, and CPT codes that are typically provided by the provider.

2. Payer and health plan identifiers to make it easy to search and determine if a provider takes a particular insurance plan, for example.

3. State Medical Board License information to determine in which state(s) the provider is licensed. (An even better idea is for our nation to adopt a national medical board and do away with state medical boards altogether.)

An NPPES Web Service Proposal

This is my proposal for a simple NPPES web service in a RESTFul style. I have provided examples and defined some basic HTTP request and response expectations.

Here is an example of a simple search query looking for all doctors named “Fred Smith” that have a practice within the zip code 20004. We simply use any web client to query the database. In this example we return the data in JSON, but we could just as easily return it in XML, CSV, or HTML.

GET http://localhost/nppes/example.json?first_name=Fred&last_name=Smith&zip=20004

If any results are found, we get back an HTTP 200 response and a JSON file containing a non-empty list of results.

{
    "message": "OK",
    "num_results": 1,
    "results": [
        {
            "first_name": "Fred",
            "last_name": "Smith",
            "npi": "23456789",
            "address_1": "901 Pennsylvania Ave",
            "address_2": "",
            "city": "Washington",
            "state": "DC",
            "zip": "20004",
            "telephone": "202-555-5555",
            "fax": "202-777-7777",
            "provider_type": "",
            "regular_email": "",
            "direct_emails": [
                {
                    "organization": "Hope Hospital",
                    "npi": "3453456985",
                    "email": ""
                },
                {
                    "organization": "Fred Smith MD",
                    "npi": "23456789",
                    "email": ""
                }
            ]
        }
    ]
}


Note that within the results is a field called “direct_emails”. We assume each provider could have many Direct addresses, for example, if he or she works at multiple organizations. This field maps all other NPIs and Direct addresses together.
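To make the shape of that response concrete, here is a minimal client-side sketch in Python. It is an illustration under assumptions, not part of the prototype: the function names build_search_url and direct_addresses are hypothetical, the base URL is simply the localhost URL used in the examples, and the Direct address in the sample is made up.

```python
import json
from urllib.parse import urlencode

# Base URL copied from the examples in this post; a real deployment would differ.
BASE_URL = "http://localhost/nppes/example.json"


def build_search_url(**params):
    """Build a search URL from keyword filters such as first_name or zip."""
    return BASE_URL + "?" + urlencode(sorted(params.items()))


def direct_addresses(response_text):
    """Map each provider NPI to its (organization, Direct email) pairs."""
    body = json.loads(response_text)
    mapping = {}
    for provider in body.get("results", []):
        mapping[provider["npi"]] = [
            (entry["organization"], entry["email"])
            for entry in provider.get("direct_emails", [])
        ]
    return mapping


if __name__ == "__main__":
    print(build_search_url(first_name="Fred", last_name="Smith", zip="20004"))
    # Made-up response body in the shape shown above.
    sample = json.dumps({
        "message": "OK",
        "num_results": 1,
        "results": [{
            "npi": "23456789",
            "direct_emails": [{
                "organization": "Hope Hospital",
                "npi": "3453456985",
                "email": "fsmith@direct.hope.example",
            }],
        }],
    })
    print(direct_addresses(sample))
```

Because the response is plain JSON over HTTP, the client needs nothing beyond a URL builder and a JSON parser.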

We can also query by a Direct address…..

GET http://localhost/nppes/example.json?

…and we can also query by an NPI…..

GET http://localhost/nppes/example.json?npi=23456789

The above two examples return the same result as before. So we can query by name, address, provider type, etc. and we can also query just by a Direct email address or an NPI.

For completeness, I’d like to outline what things look like when no results are returned (i.e. an unhappy path). If no results are found or some sort of error occurs, then the NPPES web service responds with something other than an HTTP 200 status. Here are two unhappy examples: a valid query returning no results, and an invalid query.

No Results


GET http://localhost/nppes/search/?first_name=Fred&last_name=Appleseed

{
    "message": "No search result matched your query.",
    "results": [ ]
}

The HTTP response code is 404.

Invalid Query


GET http://localhost/nppes/search/?foo=bar

{
    "message": "You supplied an invalid search parameter: foo",
    "results": [ ]
}


The HTTP response code is 400.
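The three outcomes (200, 404, 400) can be captured in a few lines on the server side. The following is a rough sketch in Python, not the prototype's actual code: the function name search, the ALLOWED_PARAMS whitelist, and the in-memory list of records are all assumptions made for illustration.

```python
import json

# Hypothetical whitelist of search parameters; the post does not fix a final list.
ALLOWED_PARAMS = {"first_name", "last_name", "zip", "npi"}


def search(params, records):
    """Return (http_status, json_body) following the conventions above."""
    unknown = sorted(set(params) - ALLOWED_PARAMS)
    if unknown:
        # Invalid query: report the first offending parameter with a 400.
        body = {
            "message": "You supplied an invalid search parameter: " + unknown[0],
            "results": [],
        }
        return 400, json.dumps(body)

    # A record matches when every supplied filter equals the stored value.
    matches = [
        record for record in records
        if all(record.get(key) == value for key, value in params.items())
    ]
    if not matches:
        body = {"message": "No search result matched your query.", "results": []}
        return 404, json.dumps(body)

    body = {"message": "OK", "num_results": len(matches), "results": matches}
    return 200, json.dumps(body)


if __name__ == "__main__":
    records = [{"first_name": "Fred", "last_name": "Smith",
                "zip": "20004", "npi": "23456789"}]
    print(search({"npi": "23456789"}, records)[0])
    print(search({"last_name": "Appleseed"}, records)[0])
    print(search({"foo": "bar"}, records)[0])
```

Keeping the status code and the message in sync this way means a client can branch on the HTTP status alone and only read the body for details.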

Next Steps

This blog post is an open letter to HHS/CMS on how to construct a new NPPES without the complexity that often accompanies health IT. Comments and feedback are welcome.

Resources and Background NPPES

Currently the NPPES data is made available as a comma separated value (CSV) file. The field headers/names are in a separate CSV file. This URL is not on a .gov domain and is somewhat hard to find. I’ve published a link to it here. Thanks Fred Trotter. The sign up and update is not an electronic process. Here is a link to the PDF sign-up/update form. Almost certainly any NPPES modernization effort would involve making this an online process.

Footnote: Jon Stewart made this point well in his segment “The Red Tape Diaries – Veterans Benefits” (VIDEO). At Videntity, we subscribe to the mantra that building things as simply and as efficiently as possible is always the best design choice, even if it means a smaller contract.


RESTFul Direct Certificate Discovery

Direct is a health information exchange framework based on Public Key Infrastructure (PKI) and email. Public x509 certificates must be discoverable for Direct to work properly. Currently Direct uses both DNS and LDAP for this purpose. This blog post outlines a proposal for a new RESTful method of Direct certificate discovery.

The Direct Applicability Statement already provides two methods to discover certificates: DNS and LDAP. Since DNS and LDAP are easy enough to query, why do we need a third method? In a word, simplicity. If the use of a technology is going to be mandated by the government, as in the case of Direct via Meaningful Use, shouldn’t it be simple, inexpensive, and as frictionless as possible? We’re living in a web world, so why not leverage this commodity? I’ll be the first to admit that it can be hard to reduce complexity by adding requirements. We should only add requirements if they are needed. I’d like to outline some of the issues with the DNS and LDAP approaches.

Issues Serving Direct Certificates via DNS and LDAP

1. DNS does not work on many large networks. On many networks including Comcast, Time Warner, and others, looking up a Direct certificate via DNS is not possible because these networks do not allow large DNS lookups. The certificates are too large and are blocked on these networks.

2. No large hosting provider (including Amazon, GoDaddy, Yahoo, DreamHost, Google, and others) provides support for certificate “CERT-type” records. Hosting your own DNS server is complicated and burdensome. Most organizations leave this to their hosting provider. However, if you want to serve certificates for Direct via DNS you have no choice but to host your own DNS server or contract out to a third-party service.

3. Some security-minded people may frown on the idea of anonymous access to an LDAP server, especially if it also contains other resources with restricted access. In addition, LDAP is burdensome to set up under Linux/Unix.

The first item in the list is going to cause a lot of problems. See this for yourself using the command-line tool “dig”. (You could also use “nslookup”). Try typing the following:

> dig CERT

Depending on your network, you may or may not get a response. For example, on Time Warner or Comcast you might get this instead of the certificate.

;; Truncated, retrying in TCP mode.

; <<>> DiG 9.8.1-P1 <<>> CERT
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 3004
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0


;; Query time: 1 msec
;; WHEN: Thu Aug 22 13:45:31 2013
;; MSG SIZE  rcvd: 43

If you try the same command on another network (from an Amazon EC2 instance for example), you will get the certificate. DNS is working as expected but our network is blocking it. Since many providers throughout the United States use Time Warner, Comcast, and others as Internet Service Providers (ISPs), this presents a problem. An organization on one of these networks that wants to host Direct and look up certificates via DNS simply cannot.

Fixing this may take someone, such as the US government, requiring all major ISPs to change their network configuration to support Direct (i.e. large DNS responses).

I assume that is possible and has already happened in many cases, but as I said before, it’s a web world. There’s got to be a better way! Below is my proposal for serving Direct certificates in a simple RESTFul fashion. I call it RCD for “RESTFul Certificate Discovery”. Here are the details.

RESTFul Certificate Discovery

RCD Syntax


                              |-- Email or Domain part --|

RCD Protocol Specification

RCD works by ensuring the URL is predictable. The RCD service MUST operate on a sub-domain named “rcd” on “your-domain”. The “your-domain” in the host name MUST match the domain in the email/domain part. This ensures the system remains distributed among organizations just like DNS. When the certificate exists, the RCD service ALWAYS returns the public certificate in “.pem” (Privacy Enhanced Mail) format with an HTTP status code of 200. If the certificate is not found or does not exist, then the RCD response SHALL ALWAYS carry some HTTP response code other than 200 (e.g. 404, 4xx, 5xx). RCD SHALL ALWAYS operate over HTTP on port 80 or over HTTPS on port 443. An RCD client MUST ALWAYS check the HTTPS URL first and then the corresponding HTTP URL. How certificates get into RCD is explicitly left undefined and is up to the implementer.

RCD Example:

RCD is just an HTTP GET request to the URL “”

Here is another domain-bound example.

The RCD service responds with the .pem formatted certificate file or a non-200 HTTP status (404, 400 etc.).
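The client side of these rules is small enough to sketch. The Python below is an illustration under loudly stated assumptions, not a reference implementation: the function names are hypothetical, and so is the path layout in path_template, since only the “rcd” sub-domain, the “.pem” format, the status code rules, and the HTTPS-before-HTTP order come from the specification above.

```python
from urllib.request import urlopen


def candidate_urls(address, path_template="/{address}.pem"):
    """Return the URLs to try, HTTPS first and then HTTP.

    `address` is an email address (user@example.org) or a bare domain.
    `path_template` is a HYPOTHETICAL path layout used for illustration.
    """
    domain = address.rsplit("@", 1)[-1]
    path = path_template.format(address=address)
    return ["{}://rcd.{}{}".format(scheme, domain, path)
            for scheme in ("https", "http")]


def http_get(url):
    """Fetch url and return (status, body); failures yield a non-200 status."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status, response.read()
    except OSError as exc:
        # HTTPError carries the server's status in .code; other errors do not.
        return getattr(exc, "code", 0), b""


def discover(address, fetch=http_get):
    """Return the PEM bytes for address, or None if no URL answered with 200."""
    for url in candidate_urls(address):
        status, body = fetch(url)
        if status == 200:
            return body
    return None


if __name__ == "__main__":
    for url in candidate_urls("drsmith@example.org"):
        print(url)
```

The fetch parameter exists only so the lookup order can be exercised without a network; a Direct implementation would call discover() with the default.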

That’s it. That’s the entire specification. New approaches like RCD are specifically allowed in the current Direct spec: “Direct Project solutions MAY obtain digital certificates through some other out-of-band and thus manual means”.

Discover a Certificate via RCD

What does this look like in action? We will use a web client called “wget” that is already installed on Mac and most Linux flavors. You could also use a web browser, curl, or any number of web clients.

> wget
> cat
...removed for brevity...

We use “wget” to fetch the certificate at the predictable URL. Then we use “cat” to display the contents of the file that was downloaded.

Hosting RCD

So we see how simple RCD is to query, but what about hosting? This is really the best part of this approach: it couldn’t be easier. Here’s why:

1. RCD can be implemented with just about any web server (Apache, Microsoft IIS, NGINX, etc).

2. RCD is simple enough to be implemented within a content delivery network (CDN) such as Amazon S3 or Rackspace Cloud Files. Take Amazon S3 for example. S3 storage is inexpensive and redundant. What this means is that even a fairly hefty deployment would likely only run pennies a month or a few dollars per year. Setup is simple because all you need to do is place your files in a sub-directory called “read” and you’re done. No code is necessary to implement this protocol. It’s just a static file at a predictable URL.

3. RCD does not break on some networks as described above in “Issues Serving Direct Certificates via DNS and LDAP”.

Technical Note: To make RCD work with S3, you need to set up a “vanity URL” on the S3 bucket and create a CNAME within your DNS provider’s (e.g. GoDaddy, Yahoo, etc.) configuration. You will need the authority to make changes to DNS for the domain .

Drawbacks, Conclusions, and Next Steps

The key drawback to this approach is that it requires existing Direct implementations to add the functionality to query for certificates in this way. I’d estimate this would require about 50 lines of code in Java and would be very straightforward. Another drawback is that there isn’t a standard RFC to point to when it comes to REST/RESTFul approaches. That being said, RESTful approaches are usually much easier to use, and this is why they have largely supplanted more complicated SOAP services.

In my humble opinion, the juice is worth the squeeze. RCD will result in a solution that is orders of magnitude simpler and more cost effective than current options. I am writing this blog post as an open letter to the Direct community hoping that this approach will receive consideration. Thoughts? Opinions? Suggestions?



How to Serve Public Certificates with BIND for the Direct Project

Serving public certificates via DNS is quite an obscure endeavor. This is made quite evident by the total lack of documentation on how to serve CERT type records using BIND. The Direct Project requires that public certificates are discoverable via DNS and LDAP. BIND is by far the most widely used nameserver on the Internet and so this article describes how to get BIND to serve certificates for a Direct implementation.

Adding the certificate to BIND’s zone file is the tricky bit. You must do two things to the certificate so that BIND will serve it correctly: 1.) calculate the “key tag” and 2.) format the public key correctly. It is not just the contents of a .pem file. The cert’s zone file entry must be all on one line.
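For readers curious about what the key tag calculation involves, here is a minimal pure-Python sketch of the key tag algorithm from RFC 4034 Appendix B, applied to DNSKEY-style data with flags 0 and protocol 0 (the two zeros that appear in the DNSKEY call in the bcc source at the end of this post). It is an illustration only: the function names key_tag and zone_line are hypothetical, algorithm number 5 is the DNSSEC code for RSASHA1, and the tool itself relies on dnspython rather than on this code.

```python
import base64
import struct


def key_tag(flags, protocol, algorithm, key_bytes):
    """Compute the RFC 4034 Appendix B key tag over DNSKEY-style RDATA.

    Not valid for algorithm 1 (RSA/MD5), which uses a different rule.
    """
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + key_bytes
    total = 0
    for i, byte in enumerate(bytearray(rdata)):
        # Even-indexed bytes are the high byte of each 16-bit word.
        total += (byte << 8) if i % 2 == 0 else byte
    total += (total >> 16) & 0xFFFF
    return total & 0xFFFF


def zone_line(hostname, public_key_b64, algorithm=5):
    """Format a one-line CERT entry from a base64 public key (no line breaks)."""
    tag = key_tag(0, 0, algorithm, base64.b64decode(public_key_b64))
    return "%s\tIN\tCERT\tPKIX %d RSASHA1 %s" % (hostname, tag, public_key_b64)


if __name__ == "__main__":
    # Toy two-byte "key", chosen only so the arithmetic is easy to follow.
    print(key_tag(0, 0, 5, b"\x01\x02"))
    print(zone_line("examplehost", "AQI="))
```

With the toy key the sum is 5 + 256 + 2 = 263, which shows how the algorithm byte and the key bytes both feed into the tag.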

To make this task easier we have created a free open-source command line utility called “BIND Certificate Converter” or just “bcc” for short. bcc accepts two command line parameters: 1.) a host name and 2.) a certificate file (in .pem format). bcc outputs a suitable BIND zone file entry. You can redirect the output of this utility to append to your zone file. We will illustrate this with an example below. In this example, we assume you are using BIND 9 on Ubuntu. The utility BIND Certificate Converter “bcc” also requires that dnspython and openssl are installed. Let’s start by making sure we have everything we need installed.

sudo apt-get install python-dnspython openssl bind9

Copy the file “bcc” into “/usr/bin” and set it to executable. The source code is at the bottom of this blog entry.

sudo cp bcc /usr/bin
sudo chmod 755 /usr/bin/bcc

We have now installed BIND, bcc, and the other prerequisites. You will now have the directory /etc/bind. Let’s change (“cd”) into that directory. The file we want to change for localhost is “/etc/bind/db.local”. We are assuming you have a file called “” sitting in your $HOME directory (/home/ubuntu/

Now we can give our utility bcc (source code below) a dry run before outputting anything.

bcc examplehost

Let’s make things easier and just switch to a root shell.

sudo su -

Now let’s write the output of bcc to the end of “/etc/bind/db.local”

cd /etc/bind
bcc examplehost /home/ubuntu/ >> /etc/bind/db.local

Now we need to restart BIND for our change to take effect.

sudo /etc/init.d/bind9 restart

Now we can test it with “nslookup”. Execute the following command:

nslookup -type=CERT myexamplehost.localhost localhost

The BIND server responds:

Server:		localhost

myexamplehost.localhost	cert = PKIX 12437 RSASHA1 MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzXVb5YJHmV7a ljGA2KHXPgJ3KmGkMCF9AIHOGsnNN1CuuN4gCRdPVhySVsyEjWLj2xby 6i7yc8Zau2AutrFEFBibXw1YQZvbzabxpG0zZV3tG88t+03OH2VJsK2t 5adxY8wufuY353NwiCMhLtsnRMMym9BbLqQWt3v1P+s9zqq1bLQYQYJC ZexUVhBnjEEVL5oschErtoahpRlmhE1LxtmxKr75mv8RfZV17Pbn7JbP Jk36wpFKpT9SGJWC27eqUFtorOOkH6Kr+j/fGs1GWKgXjMZpeADC14Yh KrDeJtpUL3zzUtsLN9nP/MbcCzHnwdRd4Sb+5V0K1S3R/vtrDQIDAQAB

It works (hopefully)! Here is the source code to bcc.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim: ai ts=4 sts=4 et sw=4
# Copyright 2013 Videntity
# Freely reuse under the terms of
# Last Updated: August 18 , 2013

# BIND Certificate Convert - Read in a host name and PEM certificate and write out the corresponding line of BIND's zone file.

import base64
import subprocess
import sys

import dns.dnssec
import dns.rdataclass
import dns.rdatatype
import dns.rdtypes.ANY.DNSKEY


def find_between(s, first, last):
    """Return the text between two markers, or "" if either is missing."""
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""


def bind_cert_convert(hostname, pemcert):
    # Ask OpenSSL for the certificate's public key in PEM form.
    output, error = subprocess.Popen(["openssl", "x509", "-in",
                                      pemcert, "-pubkey", "-noout"],
                                     stdout=subprocess.PIPE,
                                     stderr=subprocess.PIPE,
                                     universal_newlines=True).communicate()
    clean_pk = find_between(output,
                            "-----BEGIN PUBLIC KEY-----",
                            "-----END PUBLIC KEY-----")
    # The zone file entry must be on one line, so drop the line breaks.
    clean_pk = clean_pk.replace("\n", "")
    decoded_clean_pk = base64.b64decode(clean_pk)
    # Wrap the key in a DNSKEY record (flags 0, protocol 0) so that
    # dnspython can compute the key tag.
    dnskey = dns.rdtypes.ANY.DNSKEY.DNSKEY(dns.rdataclass.IN,
                                           dns.rdatatype.DNSKEY, 0, 0,
                                           dns.dnssec.RSASHA1,
                                           decoded_clean_pk)
    key_tag = dns.dnssec.key_id(dnskey)
    bind_entry = "%s\tIN\tCERT\tPKIX %s RSASHA1 %s" % (hostname,
                                                       key_tag, clean_pk)
    print(bind_entry)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: bcc [HOST_NAME] [PEM_CERT_FILENAME]")
        sys.exit(1)
    bind_cert_convert(sys.argv[1], sys.argv[2])

So now there is documentation! I’d like to give a special thank you to Bob Halley, author of the “dnspython” library for helping me figure this out.


Direct Certificate Authority –

Over the past few months Videntity has built a robust, fully-functional certificate authority designed specifically with the Direct Project in mind. (You can find the Direct Project wiki here). The original motivation behind building “The Direct Certificate Authority (CA)” or just “Direct CA” was facilitating Meaningful Use stage 2 testing and specifically health information systems’ compliance with the Direct Applicability Statement. One requirement in particular, testing revocation, required the use of an actual certificate authority that managed CRLs. It also turns out that asking for certificates to support Direct is a tall order with most CAs because most are unfamiliar with the unique requirements and nuances of the Direct Project. Various CAs often handle revocation differently, further complicating matters. The other freely available tool for building X509 certificates that will work with Direct is called “certGen”. certGen is a good tool, but lacks support for revocation (i.e. Certificate Revocation Lists (CRLs) and/or OCSP). Hence, Direct CA was born.

Direct CA is a web-based tool so there is no software to install. It’s designed around the notion of “Trust Anchors”, whereby a “Trust Anchor” acts like a miniature Certificate Authority. Subordinate (i.e., child) email-address-bound and domain-bound certificates are created with the Trust Anchor as the parent. Direct CA also publishes all public certificates to the web automatically in common certificate formats (.pem and .der). Certificate Revocation Lists (CRLs) are generated and published on a per-trust-anchor basis and are automatically updated every few hours.

Anyone may use Direct CA for free to create certificates for testing purposes. If interested, simply request an invitation code. We hope this tool makes Direct development and implementation a little easier.

Here are answers to commonly asked questions.

Q: Can I use this software to manage my own organization’s certificate authority (or HISP / Trust Anchor)?

A: Yes. Contact sales AT videntity dot com or complete the contact form for more information.

Q: Is Direct CA open source?

A: No. The service is free, but the source code is not public. Contact us for more information on our shared source options.

Q: I see a reference to an x5c file. What’s that all about?

A: Unlike how certificates work within web browsers, with Direct the Applicability Statement has no requirement to check the certificate chain back to the root CA’s certificate. Direct only requires checking the chain back to the trust anchor. It’s debatable whether or not this is a security issue, but Direct CA creates and publishes the full chain in a convenient JSON format just in case you DO wish to check the validity of the entire chain. The “x5c” file is an “X509 certificate chain” file in JSON format. The format complies with the IETF draft for JSON Web Key (JWK) for Public Key Infrastructure (X.509) (PKIX) Certificates. Thanks to Josh Mandel for suggesting basing this feature on the work of the Javascript Object Signing and Encryption (JOSE) working group.
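If you do want to walk the whole chain, a first step is turning the x5c entries back into ordinary certificate files. Below is a minimal Python sketch. It assumes the published file is a JSON object with an “x5c” member holding a list of base64-encoded DER certificates, which is how the JWK draft defines that parameter; the exact layout of the file Direct CA publishes may differ, and the function name x5c_to_pem is hypothetical.

```python
import base64
import json


def x5c_to_pem(x5c_json):
    """Convert the certificates in an x5c JSON document to PEM strings."""
    chain = json.loads(x5c_json)["x5c"]
    pems = []
    for entry in chain:
        der = base64.b64decode(entry)  # rejects entries that are not base64
        b64 = base64.b64encode(der).decode("ascii")
        # PEM bodies are conventionally wrapped at 64 characters per line.
        lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
        pems.append("-----BEGIN CERTIFICATE-----\n%s\n"
                    "-----END CERTIFICATE-----\n" % "\n".join(lines))
    return pems


if __name__ == "__main__":
    # Placeholder bytes stand in for real DER certificates.
    fake = json.dumps({"x5c": [base64.b64encode(b"leaf").decode("ascii")]})
    print(x5c_to_pem(fake)[0])
```

Each resulting PEM string can be written to a file and examined or verified with OpenSSL in the usual way.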

DISCLAIMER: Use of this tool by government organizations does not imply recommendation or endorsement. Direct CA is for testing purposes and is provided “as-is” without warranty.


Redirect to another URL in Django urls

Things have changed a bit in Django 1.5 with respect to how redirects work without writing a view. In Django 1.5, you must use “RedirectView”, but what may not be too obvious is that “reverse” does not work with “RedirectView”. You must use “reverse_lazy” instead. I couldn’t find a complete example of this anywhere, which prompted me to write this post and answer this question on Stack Overflow. Hope this is helpful to someone.

Example using RedirectView and reverse_lazy:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim: ai ts=4 sts=4 et sw=4
# Copyright 2013 Videntity  
# Freely reuse under the terms of
# Last Updated: Aug 7 , 2013 

from django.conf.urls import patterns, include, url
from django.views.generic import RedirectView
from django.core.urlresolvers import reverse_lazy

# Uncomment the next two lines to enable the admin:
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    url(r'^$', 'home', name='website_home'),
    # reverse_lazy defers the URL lookup until the URLconf has loaded.
    url(r'^redirect-home/$',
        RedirectView.as_view(url=reverse_lazy('website_home')),
        name='redirect_home'),
)

So in the above example, the URL “/redirect-home/” will redirect to “/”. Hope this helps.


ce – Certificate Examiner

This is a quick command-line utility for displaying information contained in SSL certificates and CRLs. It is faster and easier to remember than the OpenSSL commands on which it is based. It allows you to specify only a certificate’s filename. It guesses the certificate’s type based on the filename extension. This works with most pem, der, and p12 formatted files using common extension conventions.


> ce


        Version: 3 (0x2)
        Serial Number: 4861619212740627522 (0x4377f0ea809e8c42)
    Signature Algorithm: sha1WithRSAEncryption


OpenSSL is a prerequisite. Execute the following commands to ensure that “ce” is on your path and that it executes without invoking Python directly.

> sudo cp ce /usr/bin 
> sudo chmod 755 /usr/bin/ce

Source Code for ce

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# vim: ai ts=4 sts=4 et sw=4
# Copyright 2013 Videntity  
# Freely reuse under the terms of
# Last Updated: September 10 , 2013 

import os, sys


def certificate_explain(certfile):
    # Guess the certificate type from the filename extension.
    tmplower = certfile.lower()
    if tmplower.endswith("der"):
        certype = "der"
    elif tmplower.endswith("pem") or tmplower.endswith("crt"):
        certype = "pem"
    elif tmplower.endswith("p12") or tmplower.endswith("pfx"):
        certype = "p12"
    elif tmplower.endswith("crl"):
        certype = "crl"
    else:
        print("Unrecognized file extension.  The file must end in der, pem, crt, p12, pfx, or crl.")
        sys.exit(1)

    # Build the OpenSSL command that matches the guessed type.
    if certype in ("pem", "der"):
        shellcmd = "openssl x509 -in %s -inform %s -noout -text" % (certfile,
                                                                    certype)
    if certype == "p12":
        shellcmd = "openssl pkcs12 -info -in %s" % (certfile)
    if certype == "crl":
        shellcmd = "openssl crl -in %s -noout -text" % (certfile)
    os.system(shellcmd)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: ce [CERT_FILENAME]")
        sys.exit(1)
    certificate_explain(sys.argv[1])