

The Convergence of Semantics and Machine Learning




Artificial intelligence (AI) has a long history of oscillating between two somewhat contradictory poles.

On one side, exemplified by Noam Chomsky, Marvin Minsky, Seymour Papert, and many others, was the idea that cognitive intelligence is algorithmic in nature – that there is a set of fundamental precepts that form the foundation of language and, by extension, intelligence. On the other side were people like Donald Hebb, Frank Rosenblatt, Wesley Clarke, Henry Kelly, Arthur Bryson, Jr., and others, most of them not remotely as well known, who over time developed gradient descent, genetic algorithms, back propagation and other pieces of what would become known as neural networks.

The rivalry between the two camps was fierce, and for a while, after Minsky and Papert’s fairly damning analysis of Rosenblatt’s Perceptron, one of the first neural models, it looked like the debate had been largely settled in favor of the algorithmic approach. In hindsight, the central obstacle that both sides faced (and one that would put artificial intelligence research into a deep winter for more than a decade) was that both underestimated how much computing power would be needed for either model to actually bear fruit. It would take another fifty years (and an increase in computing power of roughly fifteen orders of magnitude, around 1 quadrillion times) before computers and networks reached a point where either of these technologies was feasible.

As it turns out, both sides were right in some areas and wrong in others. Neural networks (and machine learning) became very effective in dealing with many problems that had been seen as central in 1964: image recognition, auto-classification, natural language processing, and systems modeling, among other areas. The ability to classify, in particular, was a critical step forward, especially given the deluge of content (from Twitter posts to movies) that benefits from this.

At the same time, however, there are echoes of Minsky and Papert’s arguments about the Perceptron in the current debate about machine learning – discoverability and verifiability are both proving to be remarkably elusive problems to solve. If it is not possible to determine why a given solution is correct, then there are significant hidden variables that aren’t being properly modeled, and not knowing the limits of those variables (the places where you have discontinuities and singularities) makes the model far more questionable when applied to anything but its own training data.

Additionally, you replace the problem of human intervention in developing logical (and sometimes social) structures with the often time- and people-intensive operation of finding and curating large amounts of data, and it can be argued that the latter operation is in fact just a thinly disguised (and arguably less efficient) version of the former.

The algorithmic side of things, on the other hand, is not necessarily faring that much better. There are in fact two facets to the algorithmic approach – analytical and semantic. The analytical approach, currently identified with Data Science, involves the use of statistical analysis (or stochastics) to determine distributions and probabilities. Stochastics’ strength arguably lies in the fact that, for a sufficiently large dataset, the likelihood of specific events occurring can be established to within a certain margin of error. However, stochastics is shifting from traditional statistical analysis to the use of Bayesian networks, in which individual variables (features) can be analyzed through graph analysis.

Semantics, on the other hand, is the utilization of network graphs connecting assertions, as well as the ability to make additional assertions (via modeling) about the assertions themselves, a process known as reification. Semantics lends itself well to more traditional modeling approaches, precisely because traditional (relational) modeling is a closed subset of the semantics model, while at the same time providing the power inherent in document-object-modeling languages (DOMs) such as those exemplified by XML or JSON.
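To make reification concrete, here is a minimal sketch in plain Python rather than in a real RDF store. The `Triple` type, the `ex:` identifiers, and the `statements_about` helper are all hypothetical names chosen for illustration; actual RDF expresses the same idea with its own reification vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One assertion: subject, predicate, object (hypothetical mini-model)."""
    subject: object
    predicate: str
    obj: object


# A base assertion.
claim = Triple("ex:Perceptron", "ex:inventedBy", "ex:FrankRosenblatt")

# Reification: the assertion itself becomes the subject of further assertions.
graph = {
    claim,
    Triple(claim, "ex:assertedBy", "ex:SomeTextbook"),  # hypothetical source
    Triple(claim, "ex:confidence", 0.95),               # hypothetical score
}


def statements_about(graph, target):
    """Return every assertion whose subject is the given node or assertion."""
    return {t for t in graph if t.subject == target}


about = statements_about(graph, claim)
for t in sorted(about, key=lambda t: t.predicate):
    print(t.predicate, t.obj)
```

The point of the sketch is only that an assertion can be treated as a node in the same graph, which is what lets provenance or confidence be attached to it.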


Significantly, a Bayesian network can be rendered as a semantic graph with reification, as can a decision tree. Indeed, a SPARQL query is isomorphic to a decision tree in every way that counts, as each node in a decision tree is essentially the intersection of two datasets based upon the presence of specific patterns or constraints (Hint: you want to build a compliance testing system? Use SPARQL!).
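The decision-tree reading can be sketched without a SPARQL engine at all. The example below is an illustration under assumptions: the triples, the `ex:` names, and the compliance rule are invented, and each tree node is modeled as the set of subjects matching one triple pattern, so a path through the tree is a set intersection.

```python
# Triples as (subject, predicate, object) tuples; every name is hypothetical.
TRIPLES = {
    ("ex:vendorA", "ex:hasCertificate", "ex:ISO27001"),
    ("ex:vendorA", "ex:storesDataIn", "ex:EU"),
    ("ex:vendorB", "ex:hasCertificate", "ex:ISO27001"),
    ("ex:vendorB", "ex:storesDataIn", "ex:US"),
    ("ex:vendorC", "ex:storesDataIn", "ex:EU"),
}


def subjects_matching(triples, predicate, obj):
    """Subjects satisfying a single triple pattern: ?s <predicate> <obj>."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def compliant(triples):
    """Two decision nodes, combined by intersection.

    Roughly what a SPARQL basic graph pattern would express as:
      SELECT ?s WHERE { ?s ex:hasCertificate ex:ISO27001 .
                        ?s ex:storesDataIn   ex:EU }
    """
    certified = subjects_matching(triples, "ex:hasCertificate", "ex:ISO27001")
    in_eu = subjects_matching(triples, "ex:storesDataIn", "ex:EU")
    return certified & in_eu


print(compliant(TRIPLES))
```

Adding a further constraint is just another intersection, which is why a compliance check maps so naturally onto this style of query.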

The history of software is full of purists and rather less full of pragmatists. Purists put a stake in the ground regarding their own particular set of tools and languages: C++ vs. Java, Imperative vs. Declarative, SQL vs. NoSQL, Perl vs. … well, just about anything, when you get right down to it. Pragmatists usually try to find a middle ground, picking and choosing the best where they can and covering their ears to all of the Sturm und Drang of the religious wars when they can’t. Most purists ultimately become pragmatists over the years, but because most programmers tend to become program management over the years, the actual impact of such learning is minimal.

Right now, because the incarnations of all three of these areas – neural networks, Bayesians, and semantics – are relatively new, there is a strong tendency to want to see one’s tool of choice as being the best for all potential situations. However, I’d argue that each of these is ultimately a graph or a tool for working with graphs, and it is this underlying commonality that I believe will lead to a broader unification. For instance,

  • A machine learning pipeline is a classifier. If the labels of the classifier at its core correspond to a given ontology, then once a given entity has been classified, a semantic representation of that entity can be assigned to the relevant patterns, shapes, classes, or rules.
  • A machine learning system is not an index, but as my kids would say, it is index-adjacent (what a very graph-like phrase). In essence, what you’re doing is creating a map between an instance of an unknown type and its associated class(es). The plural term is important here because a class is not a thing, it is only a labeled pattern, with inheritance, in turn, being the identification of common features between two such patterns. This map is also occasionally referred to as an inverse query, in that rather than retrieving all items that satisfy the query, you are in essence retrieving the (named) patterns that the query utilizes for one of those items.
  • It is possible (and relatively simple, to be honest) to create classifiers in SPARQL. This is because SPARQL essentially is looking for the existence of triple patterns, not just in terms of property existence, but in terms of often secondary and tertiary relationships. SHACL, an RDF schematic language, can be thought of as a tool for generating SPARQL based upon specific SHACL constructs (among other things) and those patterns can be very subtle.
  • In a similar fashion, I believe that graph analytics will end up becoming as important as (or even more important than) relational data analytics, primarily because graphs make it much easier to add multiple layers of abstraction and discoverability to any kind of stochastic process, resolving many of the same issues that machine learning tools today struggle with.
  • The inverse of this process is also feasible. SPARQL can be utilized with incoming streams to create a graph that serves to build training data for machine learning services. Because such training data will already have been labeled and identified within the context of existing ontologies, the benefit of such a process is that the resulting classifiers already have all the pieces necessary for explainability – data provenance and annotations, established identifiers, event timestamps, and more.
  • One other important point – SPARQL is able to change the graphs that it works with. Inferences, in which new assertions are created based upon patterns found in existing assertions, become especially important once you incorporate service calls that allow for the processing of external content directly within the SPARQL calls themselves. One of the next major evolution points for SPARQL will be in its ability to retrieve, manipulate and produce JSON as intermediate core objects (software vendors, please take note) or as sources for RDF.
  • This means that a future version of SPARQL no longer has to store tabular data as RDF, but instead could store it as JSON, then utilize that JSON (and associated analytics functions) to create far more sophisticated inferences with a much smaller processing footprint. For an analogous operation, take a look at the XProc XML pipeline processing language, then realize that the differences between the XSLT/XQuery pipelines and the RDF/SPARQL/SHACL pipelines are mostly skin deep.
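Several of the points above (a class as a labeled pattern, classification as an inverse query, and inference as the creation of new assertions) can be illustrated with a small sketch. This is plain Python standing in for SPARQL, and the `PATTERNS` table, the `ex:` names, and both helper functions are assumptions made for the example, not part of any real library.

```python
# A class is modeled as a named pattern: a set of required (predicate, object) pairs.
# All names are hypothetical.
PATTERNS = {
    "ex:Bird": {("ex:has", "ex:Feathers")},
    "ex:FlyingBird": {("ex:has", "ex:Feathers"), ("ex:can", "ex:Fly")},
}


def classes_for(graph, entity):
    """Inverse query: the names of every pattern this entity satisfies."""
    facts = {(p, o) for (s, p, o) in graph if s == entity}
    return {name for name, required in PATTERNS.items() if required <= facts}


def infer_types(graph):
    """Return the graph plus inferred type assertions (the original is not mutated)."""
    subjects = {s for (s, _, _) in graph}
    inferred = {(s, "rdf:type", c) for s in subjects for c in classes_for(graph, s)}
    return graph | inferred


graph = {
    ("ex:tweety", "ex:has", "ex:Feathers"),
    ("ex:tweety", "ex:can", "ex:Fly"),
    ("ex:pingu", "ex:has", "ex:Feathers"),
}

enriched = infer_types(graph)
print(sorted(classes_for(graph, "ex:tweety")))
print(len(enriched) - len(graph), "new assertions")
```

Note that `ex:FlyingBird` shares its first feature with `ex:Bird`, so inheritance shows up here exactly as described: as common features between two patterns, with one entity legitimately belonging to both.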

This last point, about pipelines, is very, very important because as the latest iterations of the Agile / DevOps / MLOps model show, pipelines and transformations are the future. By being able to work with chained transformations (especially ones where the specific pipes within that transformation are determined based upon context rather than set a priori) such pipelines begin to look increasingly like organic cognitive processes.
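A context-determined pipeline can be sketched in a few lines. Everything here is an assumption for illustration: the step names, the two input formats, and the `ROUTES` table are invented, and the only point is that the second pipe is selected from what the first pipe discovered rather than being fixed in advance.

```python
import json


def detect_format(doc):
    """First pipe: inspect the payload and record what it looks like."""
    text = doc["raw"].lstrip()
    doc["format"] = "json" if text.startswith("{") else "csv"
    return doc


def parse_json(doc):
    doc["record"] = json.loads(doc["raw"])
    return doc


def parse_csv(doc):
    name, amount = doc["raw"].split(",")
    doc["record"] = {"name": name.strip(), "amount": float(amount)}
    return doc


def label(doc):
    """Final pipe: a trivial stand-in for a classifier."""
    doc["label"] = "large" if doc["record"]["amount"] >= 1000 else "small"
    return doc


# The routing table: which pipe runs next depends on context found upstream.
ROUTES = {"json": parse_json, "csv": parse_csv}


def run(raw):
    doc = detect_format({"raw": raw})
    parser = ROUTES[doc["format"]]  # chosen at run time, not set a priori
    return label(parser(doc))


print(run('{"name": "a", "amount": 1500}')["label"])
print(run("b, 20")["label"])
```

In an RDF setting the routing decision would more plausibly come from a query against the graph than from a string prefix, but the shape of the chain is the same.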



Why it’s time to ‘embrace the discomfort’ with cloud vendor lock-in




Vendor lock-in and security are issues that have long pervaded IT and software procurement, whether computing has been centralized or not. In the era of the cloud, for all its benefits ranging from scalability to speed, the hoped-for panacea has turned out to be less than expected.

For a while, the vendors and analysts thought they’d cracked it with the gloss of multi-cloud. At the start of 2018, Cloud Academy issued a whitepaper that looked to separate multi-cloud strategy from the hype. More than 80% of enterprises reported ‘moderate to high’ levels of concern about being locked into a single public cloud platform, according to a Stratoscale survey of the time.

Cloud Academy’s conclusion: it can help, but it is not a requirement. “The key to staying flexible even within a single platform is about the choices you make,” the company noted. “Building in degrees of tolerance and applying disciplined design decisions as a matter of strategy can ensure flexibility and portability down the road.”

For Dave Moore, chief innovation officer at technology consulting firm Growth Acceleration Partners (GAP), many companies are thinking about vendor lock-in from the wrong angle. The key concerns include the data themselves, flexibility and portability, but perhaps the most important is speed.

Moore emphasises a quote attributed to the late Eric Pearson, formerly chief commercial and technology officer at Intercontinental Hotels Group: it’s no longer the big beating the small, but the fast beating the slow. 

“If you can go ahead and commit to one [provider], and not worry about being locked-in, go for the speed,” he says. “Let’s start making mistakes because we’re going too fast, not because we’re going too slow.”

In a blog post, Moore takes aim at the idea of “write once, run anywhere” (WORA) for cloud, which is often seen as a viable way to move workloads across vendors. When it comes to the portability of Java, about which the original slogan was coined in the 1990s, there is no problem. But while your code can be portable if it’s running in containers, the database service, distributed cache or message queue on which your stack also relies is more difficult to sort.


“This idea you can write once run anywhere – good luck with that,” says Moore. “If you manage to accomplish that, it’s going to take you three times as long anyway for that to work.” He adds, in a not entirely unserious manner, that if you are able to achieve true WORA for cloud, then you must pivot to that solution as your main product as it will be much more valuable than your current one.

If you are a startup, then the multi-cloud approach is likely to be a non-starter due to lack of resources and time anyway. But if you are a larger organisation, then the call may come to explore more than one of the big three – AWS, Azure, or Google Cloud Platform – if not all of them.

Moore tells a story of his time at EA, which was all-in on a single provider, when his studio was in the final stages of releasing a game seven years in the making. EA, as the overall publisher (which tends to mandate which technologies can be used), sent a diktat to explore being able to run on other providers. Moore’s response? Sure, just add another three years to the timeline.

Scalability is the cornerstone for cloud customers, being able to spin up VMs and workloads at will. For the providers, the cornerstone is the data play: ingress is free, but egress incurs a charge.

According to a 2018 survey taken at the Gartner Symposium, up to 95% of business and IT leaders said they saw cloud billing as the most confusing part of public cloud adoption. To give a simple example, if you wish to transfer 25 terabytes of data, this would be in the ballpark of $2,500 per transfer. 
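The 25-terabyte figure can be reproduced with a back-of-the-envelope calculation. The flat rate of $0.10 per gigabyte below is an assumption, chosen because it is the rate the article's own numbers imply; real provider pricing is typically tiered and varies by region and destination, so treat this as a sketch rather than a quote.

```python
from decimal import Decimal

# Assumed flat egress rate implied by the article's example; not a real price list.
ASSUMED_USD_PER_GB = Decimal("0.10")


def egress_cost(terabytes, usd_per_gb=ASSUMED_USD_PER_GB):
    """Estimated cost in USD of moving data out, using decimal units (1 TB = 1000 GB)."""
    gigabytes = Decimal(terabytes) * 1000
    return gigabytes * usd_per_gb


cost = egress_cost(25)
print(f"${cost:,.2f}")
```

Under that assumed rate, 25 TB works out to 25,000 GB × $0.10, which matches the ballpark of $2,500 per transfer.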

For those looking at egress charges and squirming, Moore notes there is little that can be done. “They’re not stupid,” Moore says of the cloud providers. “They’ll say ‘give me your data’, because moving that out is going to be ‘kerching’, and so that’s where they’re going to get you.

“The sad part of that is there’s no real solution, other than keeping your data on-prem; then you’d have latency issues and all sorts of problems like that,” Moore adds. “So that’s one of those where you just think ‘we’ll have to pay for that when we get there.’ But look at it this way – the costs of doing that are minuscule compared to trying to create something that would work in multiple providers.”

Ultimately, there is no true panacea, just a series of imperfect options. Contrary to popular belief, Moore argues, going all-in with cloud-native is the least-worst of these options.


“The main thing is to just embrace the discomfort,” adds Moore. “At some point, you’ve got to decide who you’re going to marry.”

