New Delhi, 13 June 2024: In the global race for AI innovation, India is positioning itself to take the lead by addressing critical data accessibility challenges. EY’s latest whitepaper, ‘Enabling AI development in India through data access’, outlines key recommendations for the public and private sectors, emphasizing the need to develop institutions and mechanisms that incentivize the sharing of proprietary data, establish a robust data framework, set up data marketplaces, and invest in the development of standards for data interoperability to drive AI innovation.
The whitepaper highlights significant challenges that must be resolved to facilitate easier data access and foster AI innovation in India, including privacy and confidentiality concerns, lack of clarity around data ownership, lack of interoperability between AI systems and data and, more importantly, the generation of large volumes of synthetic data.
Reflecting on the findings, Rajnish Gupta, Partner, Tax and Economic Policy Group, EY India, said, “Access to proprietary datasets will be a key differentiating factor between countries that emerge as winners and those that are unable to leverage the AI opportunity. The government’s proactive role in establishing data frameworks, along with the private sector’s participation in data sharing and standardization, will be pivotal in creating a robust AI ecosystem.”
EY’s analysis identifies four key thematic pillars emerging from government initiatives to improve data accessibility for AI development:
- Institutional capacity and governance: Establishment of a specialized agency to manage vast amounts of data and oversee the development and management of the data ecosystem, along with investments to facilitate the development of that ecosystem and expedite the launch of data and AI marketplaces and data commons.
- Establishing privacy frameworks: Laws and regulations that ensure adequate privacy safeguards for personal data and that personal data is processed and used only with the consent of the user.
- Facilitating access to non-personal/private sector data: Frameworks, regulations and rules that promote the sharing of proprietary data through marketplaces and data exchanges, set interoperability standards, incentivize private participation in data marketplaces, and clarify issues such as title to data.
- Making government data available: The government has access to vast amounts of data that can be used as training datasets. Digitizing these records and publishing them as open data can enable the development of AI models in local languages.
Similarly, businesses can leverage data-driven strategies and new AI tools to enhance efficiency in their operations. For the private sector, the whitepaper recommends:
- Standardization of datasets: Private companies should focus on standardizing their datasets to ensure consistency and usability across various AI applications. This involves annotating and labelling data accurately to facilitate interoperability.
- Securing existing datasets: Ensuring the security of datasets is paramount. Companies need to implement robust measures to protect data from unauthorized access and breaches, maintaining data integrity and privacy.
- Enabling access to datasets: Private entities should work towards making non-personal data available for AI development. This can be achieved through data-sharing agreements, licensing, and participation in data marketplaces.
- Review and compliance measures: Regular reviews and compliance checks are necessary to ensure datasets meet current standards and regulatory requirements. This includes updating data governance policies and ensuring transparency in data handling practices.
- End -