publications
publications and preprints in chronological order.
2020
- ECCV’20. S3Net: Semantic-Aware Self-supervised Depth Estimation with Monocular Videos and Synthetic Data. Bin Cheng, Inderjot Singh Saggu, Raunak Shah, Gaurav Bansal, and Dinesh Bharadia. European Conference on Computer Vision (ECCV), 2020.
Solving depth estimation with monocular cameras enables the possibility of widespread use of cameras as low-cost depth estimation sensors in applications such as autonomous driving and robotics. However, learning such a scalable depth estimation model would require a lot of labeled data which is expensive to collect. There are two popular existing approaches which do not require annotated depth maps: (i) using labeled synthetic and unlabeled real data in an adversarial framework to predict more accurate depth, and (ii) unsupervised models which exploit geometric structure across space and time in monocular video frames. Ideally, we would like to leverage features provided by both approaches as they complement each other; however, existing methods do not adequately exploit these additive benefits. We present S3Net, a self-supervised framework which combines these complementary features: we use synthetic and real-world images for training while exploiting geometric, temporal, as well as semantic constraints. Our novel consolidated architecture provides a new state-of-the-art in self-supervised depth estimation using monocular videos. We present a unique way to train this self-supervised framework, and achieve (i) more than 15% improvement over previous synthetic supervised approaches that use domain adaptation and (ii) more than 10% improvement over previous self-supervised approaches which exploit geometric constraints from the real data.
- ICML Workshop. AI-based Monitoring and Response System for Hospital Preparedness towards COVID-19 in Southeast Asia. Tushar Goswamy, Naishadh Parmar, Ayush Gupta, Raunak Shah, Vatsalya Tandon, Varun Goyal, Sanyog Gupta, Karishma Laud, and 3 more authors. ICML Workshop on Healthcare Systems, Population-Health, and the Role of Health-Tech, 2020.
This research paper proposes a COVID-19 monitoring and response system to identify the surge in the volume of patients at hospitals and shortage of critical equipment like ventilators in Southeast Asian countries, to understand the burden on health facilities. This can help authorities in these regions with resource planning measures to redirect resources to the regions identified by the model. Due to the lack of publicly available data on the influx of patients in hospitals, or the shortage of equipment, ICU units or hospital beds that regions in these countries might be facing, we leverage Twitter data for gleaning this information. The approach has yielded accurate results for states in India, and we are working on validating the model for the remaining countries so that it can serve as a reliable tool for authorities to monitor the burden on hospitals.
- ICML Workshop. IIT Kanpur Consulting Group: Using Machine Learning and Management Consulting for Social Good. Tushar Goswamy*, Vatsalya Tandon*, Naishadh Parmar*, Raunak Shah*, and Ayush Gupta*. ICML Workshop on Healthcare Systems, Population-Health, and the Role of Health-Tech, 2020.
The IIT Kanpur Consulting Group is one of the pioneering research groups in India which focuses on the applications of Machine Learning and Strategy Consulting for social good. The group has been working since 2018 to help social organizations, nonprofits, and government entities in India leverage better insights from their data, with a special emphasis on the healthcare, environmental, and agriculture sectors. The group has worked on critical social problems which India is facing including Polio recurrence, COVID-19, air pollution and agricultural crop damage. This position paper summarises the focus areas and relevant projects which the group has worked on since its establishment, and also highlights the group’s plans for using machine learning to address social problems during the COVID-19 crisis.
2021
- Preprint. DELFI: Deep Mixture Models for Long-term Air Quality Forecasting in the Delhi National Capital Region. Naishadh Parmar, Raunak Shah, Tushar Goswamy, Vatsalya Tandon, Ravi Sahu, Ronak Sutaria, Purushottam Kar, and Sachchida Nand Tripathi. arXiv preprint arXiv:2210.15923, 2021.
The identification and control of human factors in climate change is a rapidly growing concern and robust, real-time air-quality monitoring and forecasting plays a critical role in allowing effective policy formulation and implementation. This paper presents DELFI, a novel deep learning-based mixture model to make effective long-term predictions of Particulate Matter (PM) 2.5 concentrations. A key novelty in DELFI is its multi-scale approach to the forecasting problem. The observation that point predictions are more suitable in the short-term and probabilistic predictions in the long-term allows accurate predictions to be made as much as 24 hours in advance. DELFI incorporates meteorological data as well as pollutant-based features to ensure a robust model that is divided into two parts: (i) a stack of three Long Short-Term Memory (LSTM) networks that perform differential modelling of the same window of past data, and (ii) a fully-connected layer enabling attention to each of the components. Experimental evaluation based on deployment of 13 stations in the Delhi National Capital Region (Delhi-NCR) in India establishes that DELFI offers far superior predictions especially in the long-term as compared to even non-parametric baselines. The Delhi-NCR recorded the 3rd highest PM levels amongst 39 mega-cities across the world during 2011-2015 and DELFI’s performance establishes it as a potential tool for effective long-term forecasting of PM levels to enable public health management and environment protection.
2023
- ICDE’23. Towards Optimizing Storage Costs on the Cloud. Koyel Mukherjee*, Raunak Shah*, Shiv Saini, Karanpreet Singh, Khushi, Harsh Kesarwani, Kavya Barnwal, and Ayush Chauhan. International Conference on Data Engineering (ICDE), 2023.
We study the problem of optimizing data storage and access costs on the cloud while ensuring that the desired performance or latency is unaffected. We first propose an optimizer that optimizes the data placement tier (on the cloud) and the choice of compression schemes to apply, for given data partitions with temporal access predictions. Secondly, we propose a model to learn the compression performance of multiple algorithms across data partitions in different formats to generate compression performance predictions on the fly, as inputs to the optimizer. Thirdly, we propose to approach the data partitioning problem fundamentally differently than the current default in most data lakes where partitioning is in the form of ingestion batches. We propose access pattern aware data partitioning and formulate an optimization problem that optimizes the size and reading costs of partitions subject to access patterns. We study the various optimization problems theoretically as well as empirically, and provide theoretical bounds as well as hardness results. We propose a unified pipeline of cost minimization, called SCOPe that combines the different modules. We extensively compare the performance of our methods with related baselines from the literature on TPC-H data as well as enterprise datasets (ranging from GB to PB in volume) and show that SCOPe substantially improves over the baselines. We show significant cost savings compared to platform baselines, of the order of 50% to 83% on enterprise Data Lake datasets, ranging from terabytes to petabytes in volume.