It seems to me that the majority of people in the workforce are simply applying patterns to events. They aren't reinventing the wheel day in and day out. Perhaps they learned the basic techniques a decade or so ago, and every day they work, they simply know what to look for, apply some remedy, and move on to the next.
When you hear about a particular profession being all knowing and solving complex problems, it's my belief they aren't. They are simply applying the formula stated earlier.
During a discussion with my father a while back, after I dumped a list of complaints on him, he said I should number them. Then, any time in the future when I have the same complaint, I can just say the number. Hi Dad, 42, 71 and 63. Great, what else?
I think most of our jobs today can be learned by anyone with any level of degree or intelligence. Simply learn the process, learn the patterns, learn when to apply certain remedies to certain scenarios, and there you go. Nobody has time to learn all the idiosyncrasies of each and every thing, like plumbing, electricity, law, medicine, etc. They lock you in early to a specific trade, and you never venture out. It's not that each topic is complex or impossible to learn. Who has the time?
Now I will say this. In each and every case, you have to handle the exceptions. Because they always happen. And knowing how to handle them causes the cream to rise to the top. Servers down, losing money by the second, who can fix it? Patient is dying, who can save them? Water leaking through the house, who can fix it?
Each person gains experience over time. And having the ability to apply that learned knowledge at the right time is called intelligence. I think we can train computers to not only handle the day in and day out repetitive tasks that comprise almost every job function, but I also think we can train the machines to handle the exceptions, to learn over time, learn on the fly, apply troubleshooting techniques at millions of thought processes per second, and weigh the best possible outcomes against the possible downsides. It's just a matter of time. And when that happens, I'm really not sure what humans will do for a living.
And there you have it!
Much of the business world is turning towards Data Driven organizations. By that, they are leveraging their data into assets, to derive insight and produce change. As in more profit, reduced costs and streamlined processes.
With the rise of Big Data, Open Data Sources and Self Service Reporting, a new role has sprung forth to handle the massive amounts of collected data. The Chief Data Officer is one such role. What are some of the issues confronting the Chief Data Officer?
Organizations have always collected data, going back to the punch card days. Since data is now center stage, Data Quality has become very important. The Chief Data Officer is responsible for providing accurate reports based on accurate data. By removing duplicate data, incorrect data and outdated data, the reports can be consumed by staff to run the business. That is how a company becomes a data driven organization.
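The three cleanup steps named above (duplicates, incorrect data, outdated data) can be sketched in a few lines. This is only an illustration under stated assumptions: the records, the field names `customer_id`, `email` and `updated`, the "must contain @" validity rule and the cutoff date are all made up for the example, not taken from any real system.

```python
from datetime import date

# Hypothetical customer records; field names are assumptions for illustration.
records = [
    {"customer_id": 1, "email": "ann@example.com", "updated": date(2016, 1, 10)},
    {"customer_id": 1, "email": "ann@example.com", "updated": date(2016, 1, 10)},  # duplicate
    {"customer_id": 2, "email": "not-an-email", "updated": date(2016, 2, 1)},      # incorrect
    {"customer_id": 3, "email": "bob@example.com", "updated": date(2009, 5, 3)},   # outdated
    {"customer_id": 4, "email": "cy@example.com", "updated": date(2016, 3, 7)},
]


def clean(rows, cutoff):
    """Drop exact duplicates, rows with an invalid email, and rows older than cutoff."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["customer_id"], row["email"], row["updated"])
        if key in seen:
            continue  # duplicate data
        seen.add(key)
        if "@" not in row["email"]:
            continue  # incorrect data (a deliberately simple validity rule)
        if row["updated"] < cutoff:
            continue  # outdated data
        kept.append(row)
    return kept


cleaned = clean(records, cutoff=date(2015, 1, 1))
print([r["customer_id"] for r in cleaned])  # [1, 4]
```

In practice each rule would come from the business, which is why documenting those rules matters as much as the code.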
IT has gotten a bad reputation over the years for not meeting the needs of the business. Their reporting efforts were fairly slow, using a first in first out queue system. The reports were not trusted by the business for accuracy. Many times, multiple reports showed different results, leading to rogue developers hired by individual business units. Report maintenance was sometimes an issue, as the report writers hard coded the business logic into the reports, which was undocumented and made changing business rules cumbersome.
Proprietary databases purchased through specific vendors made data integration difficult. Typically, there was no easy way to mash data sets across disparate vendor databases, so data integration was a big problem. That led to reports on specific segments of the business. For example, Call Center reports would report on calls made to the call center only. They were not tied to the Financial data or the Customer data, making the process of identifying trends difficult. The business was not able to see the entire picture as the reports were fragmented by business unit. They may have had some Dashboards at the Executive level, but not for widespread consumption.
Some organizations had report writers who knew an entire segment of the business. These were internal staff who knew the business and could easily write reports that translated data into information. Because the report writer had insights into the data and the business, they sometimes had elevated status, as in the gatekeepers of the information. The business units were at the mercy of the report writer. They may have had 50 report requests in the queue, so prioritization was difficult. If the report writer suddenly left the company, that left a gaping hole in the business, because all the business knowledge just walked out the door, stored only in the report writer's brain, undocumented.
For a long time, reports were an afterthought. Yes, we just spent a million dollars on the new system, perhaps we should create some reports now. Very common theme for a long time. With the rise of data driven organizations, the need for formal processes has arisen. Data Dictionaries to document where the data is. Documents describing the business and how data flows from one system to another. Where are the servers located, who maintains them, how often are they patched or recycled? Do we have a formal process to validate the reports prior to release, to ensure accuracy? Can we create a process to report on the data before the month end runs? What system do we use to archive the reports and store them in a source code repository? And a very important question, who owns the data?
Sometimes the business wants their reports, and they want them now. In some instances, a user from the business would send the same exact report request to three different people, to see which one would complete first, taking up valuable time and resources. Sometimes the business users would spend more time trying to prove the inaccuracy of the data. Many times, a user makes additional requests on top of the original, known as scope creep. And more often than not, some users would completely bypass the report queue and go directly to a report writer for a quick report. Lastly, some users put in dozens of report requests, monopolizing the time of the limited IT staff.
Lack of Data Culture
In some organizations, reports are not trusted because the top executives don’t believe in them. They don’t support a data culture, so reports are an afterthought. Sometimes the reporting infrastructure doesn’t receive adequate budget. Other times, there’s no collaboration between staff to discuss the reports. Or no mobile strategy for real time reporting. Although Data Warehousing has been around for twenty years, many organizations do not put the effort into a centralized data solution with a single source of the truth. Typically, reporting issues do not appear overnight. They grow slowly over time, snowballing, until they become unmanageable. How does an organization get its house in order?
The Executive leadership team needs to commit to a change in direction on how they handle data. There needs to be a consensus at the top to transform into a “Data Driven” culture. To assist in this endeavor, a new role has emerged to manage the data within an organization, known as the Chief Data Officer or CDO. The CDO typically reports to an Executive and is responsible for all aspects of the data. By that, he or she is tasked with taking a data inventory for every piece of data within the organization's ecosystem. For example, what applications are connected to what database? Who owns each system? Who maintains the system? Who archives the system? How often? What data is stored in each repository? What applications pull data from those repositories? What reporting tools are used within the organization? Who are the report writers and data developers? Which departments depend on that data? What are their current reporting needs and future expectations? Where are the systems hosted, On-Premise, the Cloud or a Vendor's site? What documentation already exists?
Last but not least, the Chief Data Officer needs to document the processes for each system.
For example, who does what when? How does the data flow through each of the systems? What are the timing issues and dependencies? Does someone massage the data along the path, data entry? What are the business rules? Are they documented?
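One way to make the inventory and data flow questions above answerable is to record them as structured entries rather than free text. The sketch below is hypothetical: the `SystemEntry` class, its fields and the three sample systems are assumptions made up for illustration, not a standard schema.

```python
from dataclasses import dataclass, field


@dataclass
class SystemEntry:
    """One row of a hypothetical data inventory; all field names are assumptions."""
    name: str
    owner: str            # who owns the system ("" if nobody is on record)
    hosting: str          # "on-premise", "cloud" or "vendor"
    feeds: list = field(default_factory=list)  # systems that pull data from this one
    documented: bool = False


inventory = [
    SystemEntry("call_center_db", owner="Support", hosting="vendor", feeds=["warehouse"]),
    SystemEntry("finance_db", owner="", hosting="on-premise", feeds=["warehouse"], documented=True),
    SystemEntry("warehouse", owner="IT", hosting="cloud", feeds=["dashboards"]),
]


def missing_owner(entries):
    """Systems with no owner on record."""
    return [e.name for e in entries if not e.owner]


def upstream_of(entries, target):
    """Systems whose data flows directly into the target system."""
    return [e.name for e in entries if target in e.feeds]


print(missing_owner(inventory))             # ['finance_db']
print(upstream_of(inventory, "warehouse"))  # ['call_center_db', 'finance_db']
```

Once the inventory lives in a structure like this, questions such as "who owns each system?" become simple queries instead of hallway conversations.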
Once everything is documented, the analysis can begin. First off, are there any data redundancies? Can we consolidate databases to a single vendor or server to reduce license costs and obtain bigger discounts? Can we migrate some of the data or reporting solutions to the Cloud? Can we reduce the number of Business Intelligence tools to streamline development and costs? Do we have a data quality team in place? Is there a document or WIKI that contains the business logic within the organization? Does it document the flow of data through each of the servers and departments? Should we bring in consultants to get a jump start on a particular domain or technology? Can we leverage internal staff for domain knowledge and expertise through a matrix system by borrowing experts from different departments for specific projects? Do we have a formal process to submit and track tickets for new data requests? Do we have multiple people cross trained in the data and analytics in order to back up the developer while on vacation or if they’re hit by a bus? Are we following “Best Practices” and proper coding standards? Are we promoting the "Data Culture" from the top down, like an octopus having tentacles in every department?
The Chief Data Officer is a fairly new concept. Yet almost every organization has data to be managed, like an asset. By introducing a Chief Data Officer into a centralized department, responsible for all the data within the organization, you begin the journey to a data driven culture. That culture can accelerate sales, reduce costs and streamline processes. All of which can impact your bottom line.
Once the Chief Data Officer gets the data house in order, the next step will be to leverage your data assets into real insights. Perhaps introduce Big Data to analyze unstructured or semi structured data. Or bring in a team of Data Scientists to do statistical computations, predictive analytics or machine learning. The main goal of the Chief Data Officer is to turn data into an asset by documenting and streamlining processes, managing the data for accuracy, quality and consistency, so decisions made in the organization are based on data.
What Problem Are We Trying to Solve
One new technology finding its way into the mainstream is called the “Internet of Things”, or “IoT”. This cutting edge technology’s main function is to connect devices to a central hub. Data flows back and forth from embedded sensors out in the wild. The small packets of data, contained in messages, from sensor to central hub, contain pulses of information. Each message gets stored, analyzed, or fires an event trigger downstream for some action to occur. The messages typically flow through the Internet, as well as Radio Signals, although no specific protocol has been established. This segment of technology has the potential to create a planet of connected devices, flowing information, in real time, to improve customer experience, reduce costs and provide new applications and services.
IoT is a disruptive, cutting edge technology. By connecting devices to the internet in real time, we can manage millions of devices by communicating with billions of messages. In order for that to happen, we need to have a framework in place. That framework consists of a few key components.
Hardware
Sensors reside in a variety of products today. For example, home thermostats capture key metrics and report them back to a central hub. That information gets stored and can be accessed via applications on websites, tablets and mobile phones. However, some of the sensors are proprietary to specific Vendors.
Vendors
Vendors are typically responsible for the hardware devices. Some devices contain proprietary sensors and software applications, which can be “black boxes” in that the programmers can’t see what’s going on under the hood. Some hardware devices may not work with other Vendor products. And some of those hardware devices fail over time and need to be replaced, perhaps years from now. Will each Vendor still be in business then? If not, how will the device get replaced or repaired?
Embedded Sensors
The main premise behind the Internet of Things surrounds the idea of embedded sensors that reside within devices. These sensors capture metrics in real time. Those metrics are specific to each type of sensor. A home thermostat might capture the temperature in Celsius or Fahrenheit, the humidity level, the date time stamp, how many people are in the room, energy consumed in a specific time duration and a lot more. Sensors that reside in airplane engines may capture millions of data points per minute. It’s important to have the metrics sent back to the central hub, for storage and evaluation. How does that information get sent there?
Operating Systems
Each sensor that resides within a device must be able to run software. The software must run on an “Operating System” or “OS”. The operating system must be able to run independently and receive updates and patches over time. The central hub communicates with each device for troubleshooting, repair and reboots. Think of the Mars Rover, which runs on another planet. It runs independently, yet captures metrics for transmission back to earth. NASA has the ability to send commands to the Operating System to initiate a reboot, place it in sleep mode or send updates to the OS to fix bugs.
Protocols
IoT sensors capture data points and flow them to the central hub. They use a message “protocol” to push and pull data packets. The messages conform to a specific structure, size and set of credentials to ensure they reach their destination. Messages can be encrypted to prevent unwanted eyes scanning the data. Typically, the message systems run over the internet, using specific Protocols.
Data Storage
The incoming message needs a place to reside once it reaches the central hub. In most cases, the data is stored in a data repository, such as a Relational Database. However, due to the high volume of incoming data, Hadoop is often used to store the data across low end commodity servers using distributed architecture. Likewise, due to the huge volume potential, as in millions of transactions per second, sometimes the data is scanned as it flows in and never stored off for later retrieval. This technology is called Streaming Analytics.
Streaming Analytics
Streaming Analytics scans incoming data to look for patterns or outliers. An outlier may be an anomaly or deviation from the norm. This could initiate a predefined alert to perform some action. It can also call another alert. For example, perhaps one metric indicates a failed part within a device. This alert could kick off a job to see if the issue exists within similar parts. And perhaps check the Vendor data to see if a product recall was issued. And if the number of failed units exceeds a certain threshold, perhaps issue a warning to other customers with similar units by issuing an email alert or text. Or perhaps, the issues have already been identified and a corrective patch created. The action could be to push a software patch to the failed device to correct the issue.
Networks
Embedded sensors out in the wild need a mechanism to flow their data to the host server. One way to accomplish this is through the Internet. Perhaps a device is connected to a Wi-Fi device residing within the home. As the metrics are captured, messages can be routed to a switch, which flows only the relevant data to the central hub, to reduce network bandwidth overhead. Some vendors use radio frequency as their main mechanism to flow data. Ultimately, there needs to be a connection between the two endpoints. Some connections don’t need a constant handshake to be open all the time, they simply send small packets of data as they occur or at specific time intervals. However, other connections must be constant, in that they maintain “state”. That ensures communication between the central hub and devices out in the field, for whatever reason. Also, the networks can be secured to ensure no holes are exposed to hackers within the home network as well as people snooping the data packets as they flow across the world on the internet.
Software
In order to get Internet of Things applications created and operational, a software package can be purchased off the shelf. Likewise, teams of developers can also write their own custom software applications. When doing so, they pick a language they know, that meets the criteria for the application specifications and that they can support going forward. Internet of Things applications can be written in a variety of languages and it just depends on your team’s skill set at the time. Or you can outsource the IoT project to a consulting firm who specializes in such projects.
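To make "structure, size, credentials" a little more concrete, here is a minimal sketch of a message envelope. It is not any particular IoT protocol: the shared secret, the size limit, the field names and the choice of an HMAC-SHA256 signature as the "credential" are all assumptions for illustration. Note that a signature only proves who sent the message and that it wasn't altered; it does not encrypt the contents.

```python
import hashlib
import hmac
import json

SHARED_SECRET = b"demo-secret"   # assumption: device and hub share this key
MAX_BYTES = 256                  # assumption: an arbitrary size limit


def build_message(device_id, reading):
    """Serialize a reading and attach an HMAC-SHA256 signature as the credential."""
    body = json.dumps({"device_id": device_id, "reading": reading}, sort_keys=True)
    if len(body.encode("utf-8")) > MAX_BYTES:
        raise ValueError("message exceeds size limit")
    signature = hmac.new(SHARED_SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def accept_message(message):
    """Hub side: return the decoded body if the signature checks out, else None."""
    expected = hmac.new(SHARED_SECRET, message["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["signature"]):
        return None
    return json.loads(message["body"])


msg = build_message("thermostat-17", {"temp_f": 71.5})
print(accept_message(msg))       # the original reading comes back

# A message altered in transit no longer matches its signature.
tampered = dict(msg, body=msg["body"].replace("71.5", "99.9"))
print(accept_message(tampered))  # None
```

A real deployment would typically layer this on top of an encrypted transport and give each device its own key rather than one shared secret.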
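The outlier scan described above can be sketched with a small rolling window. This is one simple technique among many, not what any specific Streaming Analytics product does: the window size, the "3 standard deviations" rule, the sample readings and the failure threshold are all assumptions for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_outliers(stream, window=5, k=3.0):
    """Yield (index, value) for readings more than k standard deviations
    from the mean of the previous `window` normal readings."""
    recent = deque(maxlen=window)
    for i, value in enumerate(stream):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) > k * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


# Hypothetical thermostat readings with two obvious anomalies.
readings = [70, 71, 70, 72, 71, 70, 95, 71, 70, 72, 40, 71]
alerts = list(detect_outliers(readings))
print(alerts)  # [(6, 95), (10, 40)]

FAILURE_THRESHOLD = 2  # assumption: escalate once this many outliers are seen
if len(alerts) >= FAILURE_THRESHOLD:
    print("threshold reached: notify customers with similar units")
```

Because the function is a generator, it never needs the whole data set in memory, which matches the idea of scanning data as it flows in without storing it.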
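The idea of flowing "only the relevant data" to the hub can be sketched as a simple filter that forwards a reading only when it has changed enough to matter. What counts as relevant is an assumption here: the `deadband` of 0.5 and the sample readings are made up for illustration.

```python
def forward_changes(readings, deadband=0.5):
    """Return only the readings that differ from the last forwarded value
    by more than `deadband`; the first reading is always forwarded."""
    forwarded = []
    last = None
    for value in readings:
        if last is None or abs(value - last) > deadband:
            forwarded.append(value)
            last = value
    return forwarded


raw = [70.0, 70.1, 70.2, 71.0, 71.1, 69.9, 70.0]
sent = forward_changes(raw)
print(sent)                                    # [70.0, 71.0, 69.9]
print(f"{len(sent)} of {len(raw)} readings forwarded")
```

The trade-off is that the hub sees less detail, so a filter like this suits slowly changing metrics better than ones where every data point counts.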
IoT Adoption Obstacles
Security
Security has recently been raised as a potential flaw in the adoption of IoT. Unfortunately, any software can be hacked and any device connected to the internet can be penetrated. With all the devices and all that data flowing back and forth, there’s a potential for security hacks and data breaches along the way. The IoT device you installed to monitor your garage door could in fact expose your home network for infiltrators to get in, read and copy your files or install malware or Trojan horses. Recently, people have discovered IoT devices that are neither locked down nor encrypted, resulting in wide open networks. There’s even a site that shows live cameras in real people’s homes, to the entire world, without the users knowing.
No Standard Protocol
There is no definitive message protocol ensuring strict security, so any Vendor can implement any system, bypassing any security measures.
Shelf Life
If we look at an IoT application put in production today, what happens five years from now, when the Vendor who manufactured the sensor is out of business and it becomes difficult to find replacement sensors when they go bad? The same goes for software patches, operating system updates and hardware malfunctions. How much time, effort and capital will it cost to revamp pieces of the IoT system five years from now when technology has pivoted in a different direction?
Lack of Qualified Developers
The world of IT is exploding. There are new products and services released every single day. Finding people with the right talent and skills to architect, create and maintain an end to end IoT system is not an easy task. And what if the project has turnover, where developers enter and leave the project? How does the knowledge transfer occur, and how is it retained in-house? How do you find people with architecture skills, software development skills, database skills, streaming analytics skills as well as network, hardware, operating system and message protocol skills? Perhaps teams are assembled to bridge the gap of multiple skills.
Ownership
An Internet of Things application can touch many pieces of an organization. The software architects design the system, the network administrators configure networks, new servers have to be configured, databases created and connections to devices set up. Then there are Vendor agreements, Service Level Agreements, Quality Assurance people, Operations, Business Units, Sales and Accounting departments. An entire assortment of people, places and departments. So who owns the application? Again, perhaps a team of departments or perhaps a new role, “Chief Internet of Things Officer”.
For IoT adoption at massive scale, as in billions of devices, we’ll need a good mechanism to power the devices with minimal energy. Although some devices are microscopic in size, they still need some way to maintain power 24 hours per day. And how do we service all these devices out in the field when their power supply goes out?
One of the nice features about the Internet of Things is the ability to have constant communication with devices out in the field. To provide better quality service and custom fit your product to your customers. And having the ability to automate the entire process is a huge benefit. Potential cost savings, increased revenue by increasing customer base, as well as offering new products and services. And potentially reducing staff.
New Data Sources
IoT gives organizations new data streams. This data can be stored forever. And it can be analyzed on the fly or two years from now. To look for patterns and identify key pieces of insight. Also, there's the ability to link the IoT data with internal data sets or bump it up against external data sources. Perhaps we notice an increase in energy consumption based on patterns in the weather. How can we improve our product or service based on that piece of information? We can derive insights over time by studying the patterns. One way to do that is through Machine Learning. Supervised Learning can learn patterns from labeled examples. And Unsupervised Learning can find structure in the data with minimal human input.
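The weather example above can be sketched by joining two data sets on date and measuring how strongly they move together. Everything here is hypothetical: the daily energy and temperature figures are made up, and a plain Pearson correlation stands in for whatever analysis a real team would choose.

```python
from math import sqrt

# Hypothetical daily figures; both data sets and their values are made up.
energy_kwh = {"2016-01-01": 30.0, "2016-01-02": 34.0, "2016-01-03": 38.0, "2016-01-04": 42.0}
outdoor_temp_f = {"2016-01-01": 40.0, "2016-01-02": 35.0, "2016-01-03": 30.0, "2016-01-04": 25.0}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


# Link the IoT data to the external data on the days both sources cover.
days = sorted(set(energy_kwh) & set(outdoor_temp_f))
r = pearson([energy_kwh[d] for d in days], [outdoor_temp_f[d] for d in days])
print(round(r, 3))  # -1.0: in this toy data, energy use rises as temperature falls
```

A strong correlation like this only says the two series move together; deciding what to change in the product is still a business question.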
Reports and Dashboards
Having new data sets allows new reports and dashboards to be created. For consumer consumption in real time via a variety of devices. And two-way communication allows users to make changes to IoT devices remotely in real time. This new reporting capability empowers its users.
Smart Homes & Cities
Imagine having control of almost any device in your home while on vacation or at the office. Smart homes are popping up all over, offering homeowners new levels of service and products. Smart cities would allow constant and interactive communication while walking down the sidewalk or driving down the street.
Sensors can now be purchased to monitor basic metrics for the human body. They can track the number of steps you take each day, your heart rate and your blood pressure, and you can view the results on your smart phone. What if healthcare professionals had access to this valuable information for preventive care instead of treating you when you get to the emergency room? Huge cost-benefit potential. These apps can monitor your sleep to look for sleep disorder patterns and solutions. Lots of potential.
IoT technology is upon us. Offering the ability to manage billions of devices and messages at scale. To store the data indefinitely. Analyze for insights. Constant communication with remote devices. For widespread adoption of IoT, perhaps we need to define specific standards, frameworks and protocols that are open source, non-proprietary and work with a variety of languages. The IoT will continue to leverage the Cloud platforms, to centralize the hubs for flow of data. As well as the interrogation of data for insights. As well as storage. Perhaps we’ll clearly define the landscape for IoT ecosystems to train the next batch of developers to create connected devices that scale, are secure and provide value at minimal costs and ease of maintenance.