Blog

Salt Your Passphrase

TL;DR

  • Passphrase length is the greatest contributor to strength, memorability is preferable to complexity, and predictability is the greatest detractor from strength.
  • Dictionary-based attacks use whole words to reduce the cycles needed for brute-force cracking.
  • Add punctuation to turn words into non-words (eg salting) to vault your passphrase to the next level.

Baselining

We have seen that cloud-backed password vaults such as 1Password, LastPass, Dashlane, etc. make it easy to store all of our passwords in a single, secure wallet that is automatically synchronized across all our devices. With these tools we have a method to keep all our passwords in one secure place: the very definition of a High Value Target, and something we must be very diligent in protecting. Where 1Password, et al. allow us to use non-predictable, unique, long, and complex passwords in our everyday lives by removing the memorability requirement, the master password for the wallet itself must be memorable.

Let’s review what makes a good master password for your password wallet.

  1. It must be memorable so we don’t have to write it down for daily use.
  2. It must be easy to type, since keyboards on mobile devices are harder to use.
  3. It must be long so that it takes a very, very, very, long time to crack.
  4. It must be non-predictable so a lucky guess can’t undo all of our hard work.

My own password for my password vault is not a password at all, but an 8-word phrase. It measures between 45 and 60 characters long (I won’t tell you the actual length, of course), and it is very memorable to me (something I memorized back in elementary school), so I will never have to write it down. By some measures it will take more than one hundred thousand trillion trillion trillion trillion trillion trillion centuries to crack my passphrase. This is all before I applied the technique that I will cover in this blog.

Technique

To teach this technique of salting your passphrase, and to observe its benefits, I will use the cracking estimate that the site https://useapassphrase.com uses to gauge password strength. When I visited the site, it suggested 4 simple words chosen at random: shaw outlet fence butler. I suggest that you use more than just 4 words for your master passphrase, but let’s step through this simpler phrase.

Our baseline is thus a 4-word passphrase, with spaces, totaling 24 characters in length, which is estimated to require 664 centuries to crack. That’s no slouch.

shawoutletfencebutler1

Let’s see what happens if we add a bit of complexity, just a bit. Can we get outsize returns? If we add a capital letter, we increase the crack time 6x to about 3,900 centuries. Nice.

shawoutletfencebutler2

Okay this is fun. Let’s play with it a bit. What if we swap in a special character?

shawoutletfencebutler3

We get 3.1 million centuries, a 4,674x increase. Awwright! We are starting to get someplace. But maybe we’re getting too carried away. It is harder to crack but not very memorable, so let’s take a step back. Our primary goal for the master password is that it be memorable first, then non-predictable and long.

Let’s choose to put the special character at the end. Remembering to add a dash at the end can’t be too hard. Does it do anything for us?

shawoutletfencebutler4

That’s not a bad return for a single character. From 664 to 39,230 centuries is a 59x increase in crack time. But moving the special character around doesn’t yield any further gains, as we can easily see below.

shawoutletfencebutler5

That is, until we make one of these words NOT A WORD anymore.

shawoutletfencebutler6

Breaking up the word gives you 66 million centuries: a 99,460x increase over the original phrase, or 1,683x over the dash-terminated phrase. BINGO! I think we might be onto something. What if we make the word into a non-word by deliberate misspelling?

shawoutletfencebutler7

Even if you take back the dash, which reduces both the length and the number of dictionaries the cracking algorithm needs to include, but preserve the not-a-word feature in your passphrase, you still get significantly added strength: 5.2 BILLION centuries, a 7,868,692x increase on top of a passphrase that was considered somewhat strong to begin with.
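To see why a single non-word is so potent, here is a rough back-of-envelope model in Python. The dictionary size, alphabet, and guess rate below are my own illustrative assumptions, not the figures useapassphrase.com uses; the point is only the ratio between word-level and character-level search spaces.

```python
# Rough model of why a non-word multiplies crack time.
# Assumptions (mine, for illustration): a 7,776-word Diceware-style
# dictionary, a 27-character alphabet (a-z plus space), and any fixed
# guess rate; the ratio is independent of the rate.
DICT_SIZE = 7_776

def dictionary_guesses(num_words):
    # Attacker tries every combination of whole dictionary words.
    return DICT_SIZE ** num_words

def charset_guesses(length, alphabet=27):
    # Attacker must fall back to character-level search.
    return alphabet ** length

phrase = "shaw outlet fence butler"          # 4 words, 24 characters
as_words = dictionary_guesses(4)
as_chars = charset_guesses(len(phrase))

# Salting one word (e.g. "sh-aw") knocks it out of the dictionary,
# pushing the attacker toward the character-level search space.
print(f"word-level search space: {as_words:.2e}")
print(f"char-level search space: {as_chars:.2e}")
print(f"ratio: {as_chars / as_words:.2e}x more guesses")
```

The absolute numbers depend on the assumptions, but the gap of many orders of magnitude between the two search spaces is the effect the crack-time estimates above are showing.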

Conclusion

In short, use words for your memorable passphrases, but if you want to be super-duper extra mega secure, salt your password by throwing a mispelling [sic] in for good measure. The passphrase you would remember is shaw misspelled, outlet, fence, and butler, which is not all that much more to remember for a critical password: the password that guards all the others.

DNS-based web metrics collection mechanism

For the 30 years that the worldwide web has existed, people have sought to understand and classify the behavior of end users. From the initial harvesting of web server logs to the invention of beaconing web bugs, industry has been trying to extract analytics and perform data analysis. The current state of the industry is a plethora of competing collectors, with each part of the enterprise selecting its own tracking mechanism. This leads to substantial overhead and even resource exhaustion. We need a mechanism that will reduce the impact of collecting web analytics, both in terms of page load times and cookie proliferation.

The W3C has added the ping attribute to the <A> tag to instruct the browser to simultaneously load the HREF target and POST to the PING target, which ameliorates the impact of redirect chains. It is, however, dependent on the User Agent to implement it, and wide implementation is not a foregone conclusion.

Security-conscious practitioners know very well that the DNS protocol can be used to quietly exfiltrate data from a protected network. The curious reader may consult the DNS Tunneling technique.

What if we were to combine these things? What if we were to leverage the DNS Tunneling technique to collect web metrics? We could either replace or augment the current methods.

The web metric data is encoded within a purpose-built DNS query packet (the mechanism used for DNS Tunneling) for transmission to a collector instantiated as an Authoritative DNS server. We show several methods, and embodiments, for performing this type of white-hat data exfiltration to provide performance and reliability gains over existing methods.

How it would work

  • A script on the client side takes all the metrics data to be sent and creates a single string from it, adding a verification string as well as a nonce string.
  • The data are base64-encoded; if the data string is too long, it may first be compressed using common compression techniques such as gzip or bzip2.
  • The client requests this host from a subdomain (foo) on a domain (example.org), where the host is the generated string. The fully qualified domain name would be a1b2c3d4e5f6g7.foo.example.org in this example.
  • DNS resolution of the name would carry through to the Authoritative DNS server for the example.org domain, which decodes the base64 and optionally verifies data validity (looks for a predefined string or compares the checksum). If data validation is not turned on, the data are assumed to be correct.
    • If the data are correct, it responds with “does not exist” or the IP address of a classic collector, depending on the embodiment (replacement vs augmentation).
    • If the data are incorrect, it responds with a bogus IP address (eg 172.172.172.172).
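The client-side steps above can be sketched in a few lines of Python. The marker string, domain, and token names follow the examples given later in the Embodiments section; the function name and everything else are illustrative.

```python
import base64

# Sketch of the client-side encoding step described above.
MARKER = "%DNS_COLLECTOR_V1%"   # predefined verification string

def encode_metrics(tokens: dict, domain: str = "example.org") -> str:
    # 1. Flatten metrics into a single "key=value;..." string.
    payload = ";".join(f"{k}={v}" for k, v in tokens.items())
    # 2. Append the predefined verification string.
    payload += ";" + MARKER
    # 3. Base64-encode so the payload can travel as a hostname label.
    label = base64.b64encode(payload.encode()).decode()
    # Note: real DNS labels are limited to 63 bytes and names are
    # case-insensitive, so a production version would need to split
    # long labels and prefer a case-safe encoding such as base32.
    return f"{label}.{domain}"

fqdn = encode_metrics({"pageData": "foobar", "uniqID": "userID"})
print(fqdn)
```

Running this with the example tokens reproduces the base64 string shown in the augmentation embodiment below.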

 

In some embodiments the method can behave as a pass-through proxy or as an exploding proxy. In these cases, the response to the client is instant, and the Authoritative DNS server communicates with the Pixel Collector(s) and/or Data Processing Service(s) after the user receives the response.

  • The client would load the image response from the Pixel Collectors, receive no IP address, or fail instantly.
  • The Authoritative DNS server, and optionally the pixel collector, would then extract the transmitted data from their logs and pass the data to a service designated to collect and process metrics. It is important to note that an extension to this embodiment may have the ADNS device post metrics to more than one collector, thus behaving as a broker for multiple collecting agencies.

Embodiments

Let me walk you through how a few embodiments of this idea might work, using the example of sending the value “uniqID=userID” for the “foobar” page to a pixel server at example.org. In other words, the web client would be loading “http://example.org/uc.gif?pageData=foobar” to transmit the “uniqID=userID” cookie value to the collector.

As a replacement capability of the existing web metrics collectors

  • DNS resolution of the name would carry through to the Authoritative DNS server for the example.org domain, which decodes base64 and optionally verifies data validity (looks for predefined string or verifies checksum). If the optional data validation is turned off, it is assumed that the data are correct.
    • If the data are correct, responds with no IP address, eg “does not exist” message.
    • If the data are incorrect, responds with bogus IP address (eg. 172.172.172.172).
  • The client would fail in its attempt to load the resource in 2 possible ways: if the data validation was successful, no IP address; otherwise a bogus IP address. This would be an immediate failure, thereby freeing the client to continue processing other directives in the loaded page.
  • The Authoritative DNS server would then process the query logs to extract the transmitted data and pass the data to the analysis component of a web analytics service.

 

A flow chart of the above might look like this

As an augmented capability to the existing web metrics collectors

  • Combine “pageData=foobar” and “uniqID=userID” tokens into a single string, eg “pageData=foobar;uniqID=userID”.
  • Add predefined string (or checksum) into tokens string, eg “pageData=foobar;uniqID=userID;%DNS_COLLECTOR_V1%”.
  • Use base64 encoding to produce a string, eg “cGFnZURhdGE9Zm9vYmFyO3VuaXFJRD11c2VySUQ7JUROU19DT0xMRUNUT1JfVjEl”.
    If the data string is too long, it may be compressed using common compression techniques such as gzip or bzip2. The client would request http://cGFnZURhdGE9Zm9vYmFyO3VuaXFJRD11c2VySUQ7JUROU19DT0xMRUNUT1JfVjEl.example.org/. However, should the data be compressed, the client would request http://cGFnZURhdGE9Zm9vYmFyO3VuaXFJRD11c2VySUQ7JUROU19DT0xMRUNUT1JfVjEl.compressed.example.org/.
  • DNS resolution of the name would carry through to the Authoritative DNS server for the example.org domain, which decodes base64 and optionally verifies data validity (looks for predefined string or verifies checksum). If the optional data validation is turned off, it is assumed that the data are correct.
      • If the data are correct, responds with no IP address, eg “does not exist” message.
      • If the data are incorrect, responds with bogus IP address (eg. 172.172.172.172)
  • The client would load the uc.gif resource, thereby transmitting the metrics to the classic collector as well as the DNS-based collector, or fail instantly when incorrect data are sent.
  • The Authoritative DNS server would then process the query logs to extract the transmitted data and pass the data to the analysis component of a web analytics service in parallel to a classic HTTP-based collector’s metrics analysis.

A flow chart of the above might look like this
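A matching server-side sketch of the decode-and-validate step is shown below. The marker string and bogus address come from the text above; the function name, return convention, and error handling are assumptions for illustration.

```python
import base64

# Sketch of the server-side validation step described above.
MARKER = "%DNS_COLLECTOR_V1%"
BOGUS_IP = "172.172.172.172"

def handle_query(fqdn: str, validate: bool = True):
    """Decode the leading label and decide how to answer.

    Returns (answer, payload): answer is None for a "does not exist"
    response, or an IP-address string for a bogus answer.
    """
    label = fqdn.split(".", 1)[0]
    try:
        payload = base64.b64decode(label).decode()
    except Exception:
        return BOGUS_IP, None          # undecodable: treat as incorrect
    if validate and not payload.endswith(MARKER):
        return BOGUS_IP, None          # incorrect data: bogus address
    return None, payload               # correct data: "does not exist"

answer, payload = handle_query(
    "cGFnZURhdGE9Zm9vYmFyO3VuaXFJRD11c2VySUQ7JUROU19DT0xMRUNUT1JfVjEl"
    ".example.org")
print(answer, payload)
```

The decoded payload would then be handed to the log-processing component rather than answered inline, exactly as the bullet points describe.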

Further thoughts

Suppose that the program resident on the Authoritative DNS server reads the query logs, extracts the data, and performs a POST back to a classic HTTP-based pixel collector. There would now be 2 POSTs for each metric, unless something changes. An additional component could, however, expand the receipt and processing of this one signal to many collectors. By configuration, the same process could notify a plurality of collectors from this one signal sent by the client.

Area lights around the SUVRV

IMG_20190512_141939054_HDR

 

Using solar post lights for area lights in a car-camping situation is inspired by the fence-post lights currently used in rural settings as a set-it & forget-it proposition.

urpower_sl-002

There are many places where these can be sourced, but I sourced mine from Amazon.

The transformation needed to turn these into camp lights for car-camping situations is the addition of magnets. I chose to add four D901 magnets sourced from K&J Magnetics, thinking that the stated pulling power of 1.3 lbs each (4 x 1.3 = 5.2 lbs) should provide a good balance of power and thinness.

I considered many ways to affix the magnets to the back of the light fixture, Super Glue, J-B Weld, etc, but in the end gaffers tape (again sourced from Amazon) seemed best, given the inherent flexibility of the tape.

IMG_20190512_141623094

Also the tape should help with avoiding metal-on-metal contact when mounting the fixture.

A bit of trimming, and voilà, it’s done.

 

Method for evaluation of natural language translation engine via canon texts

Executive summary

Disclosed is the capability to leverage canon texts in different languages to derive and evaluate the effectiveness of language translation models for application to non-canon text. This is specifically useful when generating initial language translation models for languages that are as yet unclassified, such as native/indigenous languages.

 

Background and Problem statement

The word canon comes from the Greek κανών, meaning rule or measuring stick. In fiction, canon is the material accepted as officially part of the story in an individual universe of that story. In works of fiction, canon thus provides a structure for internal consistency within the fictional universe itself.

The Bible has been translated into many languages from the biblical languages of Hebrew, Aramaic, and Greek. As of September 2016 the full Bible had been translated into 636 languages, the New Testament alone into 1,442 languages, and Bible portions or stories into 1,145 other languages. Thus at least some portion of the Bible has been translated into 3,223 languages. Translations of the Qur’an are interpretations of the scripture of Islam in languages other than Arabic. The Qur’an was originally written in Arabic and has been translated into most major African, Asian, and European languages.

“Translation studies” is an academic interdiscipline dealing with the systematic study of the theory, description and application of translation, interpreting, and localization. As an interdiscipline, Translation Studies borrows much from the various fields of study that support translation. These include comparative literature, computer science, history, linguistics, philology, philosophy, semiotics, and terminology.

There are many mechanisms for performing machine translation of languages, from rule-based approaches to the resurgent statistical approaches, which leverage word-based, phrase-based, syntax-based, and hierarchical phrase-based translation mechanisms, among others. The statistical approach to machine translation is often seen as superior to the rule-based approach because the latter requires formally developing linguistic rules, which is costly and does not apply well to the general case. The statistical approach, by contrast, leverages existing translations and generally produces more fluent output owing to its use of a language model. It stands to reason, then, that the efficacy of a translation job is directly related to the choice of model used. The problem is then how to choose the model.

Historically, religious canon texts are amongst the first to be considered for translation into a new language. The formal nature of canonicity thus provides a roadmap for language scholars to agree on an accurate translation of the written word, and these static translations encode relationships between any two languages. This work extracts and formalizes those relationships such that evaluation of statistical language translation models can be performed in the abstract, to ascertain the most effective, efficient, and accurate model to be applied between any two given languages.

 

Novelty

A system and method for unambiguous evaluation and classification of the effectiveness or accuracy of any arbitrary language translation model between two languages.

Advantages and value

  • The advantage of this method is that by leveraging the peer-reviewed work done by translation studies scholars in performing translations of canon text we have ready-made ground truth of both the input and output state.

 

  • The value of this method is in being able to evaluate the effectiveness of a given statistical machine translation language model against another in the absolute. To be clear, our teaching provides the ability to train and refine the translation engine for a given language pair.

Method

Given a language pair (source/target) and a plurality of candidate language translation models

  1. Apply one initial translation model to translate the canon text in the source language, generating a candidate canon text in the target language.
  2. Perform a word-based comparison of the resultant candidate text against the canon text in the target language to generate a compatibility or faithfulness score. This score indicates, as a scalar, the efficacy of the selected language model in translating from source to target.
  3. Capture the result as a tuple of { source, target, model, score } values.
  4. Select another model from the plurality of candidate models. Repeat steps 1, 2, and 3 until all models are exhausted.
  5. Select the model with the highest score for the given source/target language pair as the model to be employed.
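The selection loop above can be sketched in Python. The scoring here is a naive word-overlap measure standing in for the richer word-based comparison described in the Detail section, and the "models" are stood in by plain functions, since the method is agnostic to the translation engine used.

```python
# Sketch of the model-selection loop: try each candidate model on the
# source-language canon, score against the target-language canon, and
# keep the best. Scoring is a deliberately naive word-overlap measure.

def word_overlap_score(candidate: str, canon: str) -> float:
    cand, ref = candidate.split(), canon.split()
    if not ref:
        return 0.0
    hits = sum(1 for w in cand if w in ref)
    return hits / max(len(cand), len(ref))

def select_best_model(models, source_canon, target_canon,
                      src="source", tgt="target"):
    results = []
    for name, translate in models:               # steps 1-4
        candidate = translate(source_canon)
        score = word_overlap_score(candidate, target_canon)
        results.append({"source": src, "target": tgt,
                        "model": name, "score": score})
    return max(results, key=lambda r: r["score"])  # step 5

# Toy example: two fake "models", one closer to the canon than the other.
canon_src = "in the beginning"
canon_tgt = "au commencement"
models = [
    ("model-A", lambda s: "au commencement"),
    ("model-B", lambda s: "dans le debut"),
]
print(select_best_model(models, canon_src, canon_tgt))
```

A real implementation would substitute actual translation engines for the lambdas and the graded comparison from the Detail section for the overlap score.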

In some embodiments, an intermediary language is leveraged to translate between a source and target text. In such cases the procedure becomes:

  1. Apply one initial translation model to translate the canon text in the source language, generating a candidate canon text in the intermediary language.
  2. Apply another translation model to translate the candidate canon text in the intermediary language, generating a candidate canon text in the target language.
  3. Perform a word-based comparison of the resultant candidate text against the canon text in the target language to generate a compatibility or faithfulness score. This score indicates, as a scalar, the efficacy of the selected models in translating from source to target.
  4. Capture the result as a tuple of { source, target, intermediary, model1, model2, score } values.
  5. Select another model from the plurality of candidate models. Repeat steps 1 through 4 until all models in all intermediary languages are exhausted.
  6. Select the models and intermediary with the highest score for the given source/target language pair as the combination to be employed.

Detail

maz20170408 - Page 1

Fig 1: Selection of best model for a given language pair

 

Given a language pair (source/target) and a plurality of candidate language translation models

  1. Apply one initial translation model to translate the canon text in the source language, generating a candidate canon text in the target language.
  2. Perform a word-based comparison of the resultant candidate text against the canon text in the target language to generate a compatibility or faithfulness score. This score indicates, as a scalar, the efficacy of the selected language model in translating from source to target. The resulting translation is compared with the canon text in the target language in the following ways:
    • If it is the same, the maximum score is assigned.
    • If it is different, individual words are extracted from the translation, compared with their translations in the canon, and ranked accordingly. For example, the words black and white are completely different, so that comparison score will be very low, but not minimal, because both words name a color; it is a better result than translating “white” to “carrot”.
      Similarly, synonyms are ranked a lot higher, as is correct word order, etc. Simply said, the comparison tries to determine how similar the meaning of the translated text is to the canon in the target language.
  3. The result is captured as a tuple of { source, target, model, score } values.
  4. Select another model from the plurality of candidate models. Repeat steps 1, 2, and 3 until all models are exhausted.
  5. Select the model with the highest score for given source/target language pair as the model to be employed.
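The graded word comparison described above can be illustrated with a toy scorer: exact matches score full marks, known synonyms score high, words in the same semantic class (e.g. both colors) score low but nonzero, and unrelated words score zero. The tiny lookup tables and the specific weights are invented purely for illustration.

```python
# Toy illustration of the graded word comparison: the tables and
# weights are made up; a real system would use lexical resources
# (synonym sets, semantic classes) for the languages involved.
SYNONYMS = {("big", "large")}
CLASSES = {"black": "color", "white": "color", "carrot": "vegetable"}

def word_score(a: str, b: str) -> float:
    if a == b:
        return 1.0                    # exact match: maximum score
    if (a, b) in SYNONYMS or (b, a) in SYNONYMS:
        return 0.8                    # synonyms rank a lot higher
    if CLASSES.get(a) is not None and CLASSES.get(a) == CLASSES.get(b):
        return 0.1                    # e.g. "black" vs "white": both colors
    return 0.0                        # e.g. "white" vs "carrot": unrelated

def text_score(candidate: str, canon: str) -> float:
    # Position-wise comparison; a real scorer would also weigh word order.
    pairs = zip(candidate.split(), canon.split())
    scores = [word_score(a, b) for a, b in pairs]
    return sum(scores) / max(len(candidate.split()), len(canon.split()))

print(word_score("black", "white"), word_score("white", "carrot"))
```

This reproduces the ranking in the text: black/white scores low but nonzero, while white/carrot scores nothing at all.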

Backpacking Table

IMG_20170123_162908

Motivation

Starting out in backpacking late in life, I came to the sport with a few pounds around the midriff, but also with a bachelor’s degree in Engineering, as well as enough ingenuity courtesy of the School of Hard Knocks. This provides the motivation for a solution that can be

  1. easily transportable, eg collapsible
  2. lightweight, and
  3. cheap,

knowing full well that achieving all three is a physical impossibility. What I’ve come up with is what I believe to be a good compromise of these three properties, finding a sweet spot somewhere in the middle of the Triple Constraint Triangle. [more info]

WebTriangleInfoGraph

List of Materials

  • A bamboo cutting board, 11″ x 15″, sourced from my local Walmart for about seven bucks. [sample]
  • Two yard sign frames, sourced from my local Lowes for about two-and-a-half bucks. [sample]
  • A length of about 10-ft of bank line, which I happen to have kicking around the house. [sample]

List of Tools

  • A hacksaw such as this one with a blade for metal.
  • A drill, with a 1/8″ bit.
  • A marker.
  • Some masking tape.

Procedure

  1. Take the two yard sign frames and make a mark about 1″ above and below the cross bars. Because the crossbars are usually about 9″ apart, this gives a total length of about 11″ between the marks. Use the hacksaw to trim away the extra length from the yard sign.
    Apr 16, 2017 15-02-06
  2. Place a strip of masking tape width-wise about 1-1/2″ from the edge
    Untitled
  3. Use the marker to put some ink on the newly made cut marks and press the edge of the frames onto the masking tape. This should transfer the ink onto the tape and give a guide for where to make drill marks. It should be somewhere around 1″ and 2″ from the corner.
    Apr 16, 2017 15-14-02
  4. Drill away.
  5. Make a Bowline [how-to] on the bank line and wrap it around the “handle” of the board.
  6. Place the yard sign frame through the newly drilled holes and run the string through so that the tension tries to spread the legs out. Secure the line with a clove hitch [how-to] at each of the two posts. Run the line across the length of the table (underneath, which keeps the surface more usable) and repeat the clove hitches and wraps at the bottom.
    IMG_20170416_141215 IMG_20170123_163300

When Marketing and Techies don’t speak

Time and again …

And so it goes that people talk past each other and don’t even realize it. In my view this is what happened to 84 Lumber during their Super Bowl LI campaign. They drove traffic to a website that was ill equipped to handle the influx.

It takes a certain design to survive a flash mob’s interest. Ever wondered why the bathrooms at a stadium or theater are significantly larger than those in a train station, say Grand Central? Both facilities are designed to handle lots and lots of people. The hallways are spacious, and the materials used are hard, durable materials. So why would the bathrooms be so different? In two words: arrival rate. The arrival rate is what drives the design.

Simply put, processing 100 widgets that arrive in a trickle means one or two processors would likely be sufficient. In contrast, if those 100 widgets show up at once, the last widget processed will necessarily have a long wait while all the previous 99 widgets are processed. For every one second of per-widget processing time, the 100th widget will spend an additional 99 seconds in queue, assuming only one processor. Open a second processor and the wait is halved.
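That arithmetic can be checked with a few lines of Python. The function and its round-robin dispatch assumption are mine, purely to make the numbers concrete.

```python
# Back-of-envelope check of the arrival-rate arithmetic above:
# n_widgets arrive at once, each taking `service` seconds, spread
# across `servers` processors dispatched round-robin.
def last_widget_wait(n_widgets, service, servers):
    # The last widget waits for every widget ahead of it on its
    # own processor to finish first.
    ahead = (n_widgets - 1) // servers
    return ahead * service

print(last_widget_wait(100, 1, 1))   # 99 seconds in queue
print(last_widget_wait(100, 1, 2))   # roughly halved
```

With one processor the 100th widget queues for 99 seconds; opening a second processor cuts that to 49.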

Specifics

The journey84.com site was not designed like a stadium or a theater, to handle a timed event; instead it was designed like a regular site. Consequently it is not surprising that when a flash mob showed up, it failed. Let’s take a look at the transaction below.

> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: journey84.com
> Accept: */*
> 
< HTTP/1.1 200 OK
< Cache-Control: private
< Content-Length: 20263
< Content-Type: text/html; charset=utf-8
< Server: Microsoft-IIS/8.0
< X-AspNetMvc-Version: 5.2
< X-AspNet-Version: 4.0.30319
< X-Powered-By: ASP.NET
< Set-Cookie: ARRAffinity=e1c140a4aab77c745107aadc5e7989608b845ae8bef3dccacc8aa1d26a8caebe;Path=/;Domain=journey84.com
< Date: Mon, 06 Feb 2017 01:40:56 GMT

Cache-Control: private

The server is saying that the home page for the website should not be cached by so-called public proxies; only end-browsers should cache the page. The server is saying: only I can give the page to the browser. Helpful proxies need not apply. Consequently every browser’s request will come to the server.
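For contrast, a response header set that would have let shared proxies absorb some of the flash mob might look like the following. The max-age value is illustrative; the right lifetime depends on how often the page actually changes.

```
HTTP/1.1 200 OK
Cache-Control: public, max-age=300
Content-Type: text/html; charset=utf-8
```

With public caching and even a five-minute lifetime, intermediate caches and CDNs could have served the bulk of identical home-page requests without touching the origin server.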

Set-Cookie: ARRAffinity

The Azure Request Router Affinity cookie is enabled. Why would you do that? What is so special about this particular website that requires session affinity to the server? This feature should have been disabled.

In the end…

…the result was easily predictable and what most people saw was this

16406832_10210842263226245_7661275328903454649_n

Hardly a compelling marketing campaign.

Barbicans for Cloud Environments

Abstract

Public cloud environments require that system administrators access the cloud hosts for system-level activities over untrusted networks. In order to maintain perimeter security, so-called jump or bastion hosts are used to reduce the attack surface. This paper discusses a mechanism to provide a very strong bastion host through the use of shared SSH keys with planned obsolescence, in combination with individual SSH keys. The result is a bastion that is secure, with minimal burden on administrators for user-access maintenance.

High Level Concepts

Barbican

A barbican is a fortified outpost or gateway, such as an outer defense to a city or castle, or any tower situated over a gate or bridge, which was used for defensive purposes. Usually barbicans were situated outside the main line of defenses and connected to the city walls by a walled road called the neck. Deploying two bastion hosts straddling a firewall can serve the function of the barbican and its neck, allowing controlled access to the protected cloud environment.

 jan-8-2017-11-41-14

Bastions

The external bastion, being outside the firewall, is exposed to the world. The internal bastion, being inside the firewall, is only accessible by the external bastion. Each bastion host provides the door while the firewall enforcement creates the neck of the barbican.  Entities wanting to transit the barbican must authenticate against both bastion checkpoints.

image001

Authentication Domains

Going through the trouble of creating a two-bastion barbican is all for naught if there is only one set of tokens that will allow transit through both checkpoints.  Consequently it is beneficial to require two sets of authentication tokens to successfully transit the barbican.

image003

Just like the concentric walls of a castle have outer walls that are lower than the inner walls, so should the stringency of the authentication tokens mimic the strength of the walls. The outer authentication domain token may be shared amongst all authorized users of the cloud environment, while the inner authentication domain token should be individualized to each user.  The outer authentication domain token could be further divided by role, eg sysadmin, netadmin, appadmin, dba, etc.

Tokens

SSH’s key-based authentication is used as the mechanism for exchanging authentication tokens with each of the bastion hosts. The tokens are configured to behave differentially in each of the domains.  The table below summarizes the differences.

Property of Token | Outer domain | Inner domain
Scope | Shared | Individual
Lifetime | Ephemeral | Persistent
Generation | Automated by Service | User-generated
Server-side Enablement | Periodically by Service | Once at creation time
Client-side Enablement | Periodically by User | Once at creation time

Implementation

Service Key Generation and Distribution

Cron-based, PGP-encoded, uploaded to an FTP or web server for distribution. Once for each role.

Service Key Installation and enablement

Cron-based, build the authorized_keys file with n-many generations, upload to the external bastion. Once for each role.
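The key-installation job above can be sketched in Python. The directory layout, the role-date naming convention, and the generation count are all assumptions for illustration; a real deployment would pull keys from wherever the generation job published them.

```python
# Sketch of the cron job that assembles authorized_keys from the last
# n key generations for a role, so clients holding a slightly stale
# ephemeral key are not locked out mid-rotation.
from pathlib import Path
import tempfile

def build_authorized_keys(keydir: Path, role: str, generations: int) -> str:
    """Concatenate the newest `generations` public keys for `role`."""
    # Date-stamped filenames sort lexicographically, newest last.
    keys = sorted(keydir.glob(f"{role}-*.pub"), reverse=True)[:generations]
    return "".join(k.read_text() for k in keys)

# Example: four weekly keys on disk, keep the 3 most recent.
d = Path(tempfile.mkdtemp())
for week in ("2017-01", "2017-02", "2017-03", "2017-04"):
    (d / f"sysadmin-{week}.pub").write_text(f"ssh-rsa KEY_{week} svc\n")
print(build_authorized_keys(d, "sysadmin", 3))
```

The resulting file would then be uploaded to the external bastion for the shared role account, once per role.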

External Bastion

Simple host. No end-user accounts. Provides no services. Guards against escalation of privilege. Provides a command for accessing the internal bastion, for convenience. Routinely cleans the shared user’s home directory. Uses IPTables to control inbound access to ports.

Internal Bastion

No shared accounts. Provides limited services. Guards against escalation of privileges.

Firewall

Use a firewall to enforce no sideways access to the internal bastion. This is the equivalent of the neck in the barbican.

Syslog

Without monitoring, all systems succumb. A simple syslog receiver listens in on the comings and goings of the barbican.

jan-8-2017-12-01-43

Operations

Initial setup

The user provides their own public key to the service for installation as part of the initial procurement of authorized user access. The user downloads and installs the service’s ephemeral/service key onto their workstation. The ephemeral key must be renewed periodically by the user.

jan-8-2017-11-47-39

Day-to-day Use

The user initiates an SSH session with the external bastion using the service’s ephemeral key, then initiates a second SSH session with the internal bastion using their own persistent key, via SSH agent forwarding.

jan-8-2017-11-48-04