NSW Government’s Chief Data Scientist on Data Governance Frameworks to Protect Privacy, Security, and Ethical Standards
Dr Ian Oppermann is the NSW Government’s Chief Data Scientist working within the Department of Customer Service. He is also an Industry Professor at the University of Technology Sydney (UTS).
In the lead up to presenting at ADAPT’s gathering of top Chief Data Officers, Ian speaks with Senior Research Strategist, Aparna Sundararajan, on his thoughts around data sharing frameworks.
When asked what the most crucial action leaders can take today to create future-ready information architectures, Ian answered that managing data quality was the single, most important priority.
Ian also discusses purpose-led data custodianship, wherein risks and governance must consider how the end-user leverages data.
He also hints at a near-future of deploying machine learning and artificial intelligence on the NSW Government’s datasets.
This is our first event for Chief Data Officers, Chief Data Architects, and engineers, the analytics and visualisation teams. From their perspective, in that context, what is your session going to deliver for them?
Thanks very much for having me and especially for the sneak peek, but New South Wales has been on quite a data journey. We started some time ago with the data analytics centre. We’ve been doing data-driven projects using data science. And over the years, we’ve realised that there are some really big frameworks we need to put in place.
Last year, we released our AI strategy policy and how-to guide. We also released our Smart Cities, Smart Places, Strategy and Frameworks. We’ve just recently released our Data Sharing Strategy.
That may seem to have the cart before the horse or the wrong way around, but data sharing is the big game.
As we’ve been moving along, we’ve been trying very hard to reframe the conversation around outcomes and indicators and build the frameworks for data sharing and governance for different types of data.
Not all data sets are the same, but they also identify the data as an asset. Identify the most valuable data assets within the New South Wales government, and genuinely treat them as assets and build the systems and frameworks around them that allow us to appropriately value the data that we’ve got.
That’s great. I’m glad you mentioned data as an asset because I was speaking with Rocky just today. He also mentioned data as an asset, treating it as an intangible asset, and creating value. Now it seems like there are different rules for data as such within organisations. And we’re talking a lot about how we create value externally.
But when we look at the ADAPT survey that we conducted, we also see huge internal challenges from the data. The value of data that has not been able to be generated presents a lot of challenges.
There are different roles in terms of Chief Data Officer, Chief Data Architect, engineers, data visualisation, and analytics teams. And everyone seems to be working in a different accord. How do you from your experience and especially how you planned the whole data journey within the New South Wales Government, what are some of the best practices you are seeing? What’s working?
You just asked a very, very big question. One of the things about data is that it has many, many different uses, and it has different values to different people depending on the use.
The reason you want to use data matters quite profoundly.
It matters because when you’re thinking of the role of the data custodian, thinking about all the different risks and governance issues associated with the use of that data, that data custodian needs to look down at the end-use case. The way the data gets there to understand whether or not the frameworks are appropriate, the use is appropriate, the authorising mechanisms are appropriate.
The other aspect is, of course, that not all data sets are the same. Some data sets are really sensitive and personal. And the sensitivities come from so many different dimensions. Some data sets are not that sensitive but still have personal information.
Some data sets are very sensitive and do not have personal information, and some are neither sensitive nor personal. So again, the issues you need to think about the roles and responsibilities you need to contemplate depend very much on the data subject itself.
It’s about how that data is going to be used again, those sensitivities, along that data lifecycle. So the upshot of all of this is it’s complicated.
It’s complicated depending on how you intend to use the data and for what purpose. So that complexity often inhibits data sharing and use and often inhibits the release of value from data.
The first question was around the value of data itself, treating data as an asset.
It seems insane that we rely so much on data, but we don’t have an accounting standard that would recognise data as an asset.
We are genuinely trying to value the data assets we create, the data assets that we use. And some of it is what it costs to create.
Some of it is, what would someone pay to access the data? And we say pay in nominal terms, in terms of government, and what would the damage, or the impact or harm be if the data was accidentally released. Trying to develop frameworks like that gives us a sense of the value of the data asset and, again, thinking across the entire data life cycle, where we need to put in place governance and processes.
That then speaks to the second part of the question you asked, and that’s the different roles. Because we look across data from the actual creation to the transmission, storage, analysis, and use of insights or reuse of insights to archiving, and then potentially deletion, the roles change during the course of that.
It is important to think about the roles that swap in and out, whether a data engineer or a data architect or data scientist, that essentially has different levels of primary responsibility at different stages in that data life cycle.
The final part of that is that it’s so complicated and the roles change so much depending on where you are in the data life cycle and the context that we need to simplify it.
One of the aspects of the new data strategy is, can we take all of those complexities and map them to a couple of different templates. In this high trust environment, you need to have expert people working in a highly authorised, highly secure environment.
This needs to be considered through to, this is data we’re going to release as open data. The terms and conditions and frameworks and expectations you’ve put on those different layers of levels of trust or levels of security need to flatten out, need to simplify because quite honestly, the complexity is so great, if we don’t map it to three or four or five templates, then we get nowhere. That was a very, very long answer, but you asked a very, very big question.
That was a very succinct answer, Ian. I think it makes a lot of sense, especially with looking at what our audiences are struggling with are exactly those things. What is the entire outcome that they’re trying to achieve?
Then what is the life cycle or the journey of data, and from what you shared, I almost feel like the whole journey can be divided into the actual roles of people who take care of data at each journey so that the outcomes can be achieved. And then it sort of becomes a circle.
I’m keen to understand more about the framework and how it tackles these areas in the session. What are you personally most excited about for 2021-’22 from the data perspective?
Yeah, that’s a really good question.
At least looking around at my colleagues, the use of data is inevitable.
The use of smart, whether it be artificial intelligence or more sophisticated machine learning, is inevitable.
As we continue to deliver a better experience for the customers of New South Wales, and that’s the language we’re using in New South Wales, not citizens, but customers, as we deliver better, more personalised, more seamless services to the customers in New South Wales.
We are deliberately but cautiously moving forward with the increased use of data. The exciting part is that I think we now, after at least my perspective, after all of these years of doing and trying, are starting to build useful frameworks that allow us to talk about things in terms of outcomes. These outcomes are described in very public principles, which we can argue about, and we can debate.
This is a language that we will describe in human terms, but we’re now starting to build the pathway between the principles and the bits. We’re starting to build the frameworks that allow us to connect what we want to achieve to measures, risk frameworks and the policies associated with that down to… And this is what we do with this data set. And this is what we do with this data set under these circumstances.
That journey between the principles and the bits is starting to make sense.
And we’re starting to build genuine frameworks that allow us to move down to what an engineer or an architect or a data scientist put in should do, back out to the real world of principle-based outcomes.
That’s exciting. It’s starting to come together.
Wow. That’s, that’s, really, really interesting.
If you had to just give one piece of advice as to one thing that every organisation should definitely, just start with, that 1% improvement on their data strategy, what would that be?
That’s, again, a tricky question. When discussing trying to achieve these big real-world outcomes, we talk about some no regrets activities, and the no regrets activities typically refer to improving your data capability.
You don’t need to be perfect to get started. You should just get started and try to improve your ability to use and analyse.
But if you ask for one thing, it’s data quality. Data quality is the biggest of the sensitivities that drive people away from data sharing and use.
People are worried about their data quality; they’re worried about insights that are generated. They’re worried about how it reflects on their organisation. As a no regrets activity, invest in improving data quality.
Oh, great. What stuck with me is valuating data. The asset value, the framework, and the no regrets attitude. Well, thank you so much again for talking to me. I hope the audiences are going to really like your sessions and take away a lot from them.
Thank you. Alright, fantastic, I look forward to it.
NSW Government’s Chief Data Scientist, Ian Oppermann, will be presenting on Data Sharing Frameworks and the NSW Government Data Strategy at ADAPT’s Data Edge in 2022.
Discover more about Ian’s presentation and the full Data Edge agenda here.