Shefali Srivastava

About Me

I am a Master's student in Computer Vision at Carnegie Mellon University for Fall 2021. My research interests lie in developing reliable and sustainable machine learning and computer vision systems that can be deployed in the real world. Specifically, I am interested in computer-vision-assisted medical imaging, 3D reconstruction, vision applications in biometrics, and vision coupled with natural language processing.

I’m an avid software developer currently working with Adobe Systems. At Adobe, I work in the Digital Experience (DX) Cloud, which is responsible for user experience. As part of the Flow Service team, I am responsible for ensuring smooth data ingestion into Adobe Experience Platform (AEP) from multiple third-party data sources. I am also the sole owner of the open-source ETL SDK provided by Adobe to ingest data in the Adobe-standard XDM format onto AEP. The SDK is available here.

I earned my Bachelor’s in Information Technology in Spring ’18 at Netaji Subhas Institute of Technology, Delhi University, where I was advised by Prof. Deepak Kumar Sharma on developing intelligent, ML-based optimised routing algorithms for Opportunistic Networks.

During the third year of my undergraduate degree, I spent one wonderful winter doing research in point-to-point matching for 3D image reconstruction at the Visual and Parallel Computing Lab at the University of Georgia (UGA), advised by Prof. Suchendra M. Bhandarkar. The research results are available here. After returning, I continued to enhance the algorithm developed at UGA by employing end-to-end deep learning. The results are available here.

In my free time, I continue to pursue my interests in Machine Learning, Computer Vision and Natural Language Processing by taking up the latest courses and reading the latest research publications in these areas. My most recent achievement is a patent approval for my project “BERTMap: Automatic Schema Mapping based on Semantic Data Type Detection”.

My CV is available here. Feel free to reach out to me at shefali9625@gmail.com.

Patents

BERTMap: Automatic Schema Mapping based on Semantic Data Type Detection

Shefali Srivastava*, Simran Aggarwal*, Rishika Karira

Schema mapping is the process of mapping between different schema standards so that data can be transformed smoothly from one format to another. Every organization requires its data in a consistent format, so incoming data must be mapped before it is on-boarded onto any platform. Given the volume of data in the software industry, manual mapping is tedious, which creates a need for automatic mapping. Automatic schema mapping is itself a challenging problem owing to format-specific conversions. We propose BERTMap, a novel clustering-based deep learning model that maps input data to target schema fields. Clustering of schema fields is defined on semantic data types, which eliminates frequent retraining of the model when new target schema fields are provided and enables seamless integration of hierarchical schemas. BERTMap achieves an F1 score of 0.97, outperforming state-of-the-art semantic data-type detection models on the task of schema mapping.

Publications

BeamAtt: Generating Medical Diagnosis from Chest X-Rays using Sampling Based Intelligence

Aman Sawarn, Shefali Srivastava, Monika Gupta, Smriti Srivastava

Medical imaging coupled with image captioning is fuelling the possibility of generating accurate medical reports with minimal human intervention. In economically downtrodden nations, this creates an opportunity for the poor to acquire world-class treatment from around the globe with an efficient time to market. Chest X-ray images are integral to the diagnosis and treatment of respiratory problems. We propose BeamAtt: an end-to-end deep CNN-RNN based encoder-decoder framework that incorporates spatial visual attention to generate a terse diagnosis from chest X-ray films. We choose to use a GRU RNN decoder. To boost performance over state-of-the-art methods with complex architectures, we employ sampling-based techniques along with beam search optimisation while generating inferences, and argue that a simpler framework with intelligent optimisation can achieve higher performance metrics. We show how vivid attention plots can provide deep insight into the region of the image on which the network concentrates to generate a word token. We compare our model with recent prior art using the standard evaluation metrics BLEU-1/2/3/4 and ROUGE-L and demonstrate the superiority of the proposed method. BeamAtt achieves a BLEU-1 score of 0.56 and a CIDEr score of 2.077, which is a significant boost in performance over contemporary solutions.
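To make the beam search step concrete, here is a minimal Python sketch over a toy next-token table. It covers beam search only, not the sampling-based techniques, and it is not the BeamAtt implementation: `TOY_TABLE`, `toy_step`, the vocabulary and the probabilities are hypothetical stand-ins for the attention-equipped GRU decoder.

```python
import math

# Hypothetical next-token probabilities standing in for a trained decoder.
TOY_TABLE = {
    "<s>": {"no": 0.6, "mild": 0.4},
    "no": {"effusion": 0.5, "acute": 0.5},
    "mild": {"effusion": 0.9, "</s>": 0.1},
    "effusion": {"</s>": 1.0},
    "acute": {"disease": 0.4, "</s>": 0.6},
    "disease": {"</s>": 1.0},
}


def toy_step(tokens):
    """Return the next-token distribution given the tokens decoded so far."""
    return TOY_TABLE[tokens[-1]]


def beam_search(step_fn, start, end, beam_width=3, max_len=10):
    """Return (tokens, log_prob) of the best sequence found by beam search."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, p in step_fn(tokens).items():
                if p > 0.0:
                    candidates.append((tokens + [tok], score + math.log(p)))
        if not candidates:
            break
        # Keep only the beam_width most probable extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if needed
    return max(finished, key=lambda c: c[1])
```

With this toy table, `beam_search(toy_step, "<s>", "</s>", beam_width=1)` reduces to greedy decoding and commits to the locally most probable first word, while `beam_width=2` keeps a second hypothesis alive and ends up with a more probable full sequence.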

Neural Network based Routing Protocol for Opportunistic Networks with Intelligent Water Drop Optimization

Shefali Srivastava*, Shubham Kumaram*, Deepak Kumar Sharma

An Opportunistic Network (OppNet) is a system of wirelessly connected nodes in a varying network topology. Routing in OppNets is a challenge: it requires an intelligent dynamic strategy to select the next best node for forwarding a message. This paper proposes an intelligent routing mechanism based on the Intelligent Water Drop (IWD) algorithm, which is used in tandem with Neural Networks (NNs) as an optimization technique to solve the problem of routing in such networks. The nature-inspired IWD algorithm provides robustness, whereas the neural network base of the algorithm helps it to make intelligent routing decisions. The weights in the Neural Network model are calculated by the IWD algorithm using training data consisting of inputs that are characteristic parameters of nodes, such as buffer space, number of successful deliveries and energy levels, along with transitive parameters such as delivery probabilities. The proposed protocol, Intelligent Water Drop Neural Network (IWDNN), is compared with other protocols that use similar ideologies, such as MLProph, K-nearest neighbour classification based routing protocol (KNNR), Cognitive Routing Protocol for Opportunistic Network (CRPO), and Inheritance Inspired Context Aware Routing Protocol (IICAR), as well as the standard protocol Prophet. IWDNN is shown to outperform all other protocols with an average message delivery ratio of 60%, which is a significant improvement of over 10% in comparison to other similarly conceived algorithms. It has one of the lowest latencies among the protocols studied, in the range of 3000 to 4000 s, and incurs comparably low overhead costs in the range of 15 to 30. Its drop ratios are among the lowest, staying near six and approaching zero as buffer size is increased. The average time a message stayed in the buffer was the lowest, with a mean of 1600 s.

Matching Disparate Image Pairs using Shape Aware Conv-Nets

Shefali Srivastava*, Abhimanyu Chopra*, Arun C.S. Kumar, Suchendra M. Bhandarkar, Deepak Sharma

An end-to-end trainable ConvNet architecture that learns to harness the power of shape representation for matching disparate image pairs is proposed. Disparate image pairs are deemed those that exhibit strong affine variations in scale, viewpoint and projection parameters accompanied by the presence of partial or complete occlusion of objects and extreme variations in ambient illumination. Under these challenging conditions, neither local nor global feature-based image matching methods, when used in isolation, have been observed to be effective. The proposed correspondence determination scheme for matching disparate images exploits high-level shape cues that are derived from low-level local feature descriptors, thus combining the best of both worlds. A graph-based representation for the disparate image pair is generated by constructing an affinity matrix that embeds the distances between feature points in two images, thus modeling the correspondence determination problem as one of graph matching. The eigen-spectrum of the affinity matrix, i.e., the learned global shape representation, is then used to further regress the transformation or homography that defines the correspondence between the source image and target image. The proposed scheme is shown to yield state-of-the-art results for both coarse-level shape matching and fine point-wise correspondence determination.
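The affinity-matrix idea can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions, not the architecture from the paper: the Gaussian kernel, the `sigma` parameter, the use of 2-D coordinates in place of learned feature descriptors, and the function names are all hypothetical, and power iteration recovers only the leading eigenpair rather than the full eigen-spectrum. The homography regression stage is not shown.

```python
import math


def joint_affinity(points_a, points_b, sigma=1.0):
    """Gaussian affinity between every pair of points drawn from both images."""
    pts = list(points_a) + list(points_b)
    n = len(pts)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((p - q) ** 2 for p, q in zip(pts[i], pts[j]))
            affinity[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return affinity


def leading_eigenpair(matrix, iters=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of a matrix."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))  # norm of A v, with v a unit vector
        v = [x / lam for x in w]
    return lam, v
```

Because every entry of the Gaussian affinity matrix is positive, the dominant eigenvector has entries of a single sign, which is why plain power iteration is enough for this toy case.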

Deep Spectral Correspondence for Matching Disparate Image Pairs

Arun C.S. Kumar*, Shefali Srivastava*, Anirban Mukhopadhyay, Suchendra M. Bhandarkar

A novel, non-learning-based, saliency-aware, shape-cognizant correspondence determination technique is proposed for matching image pairs that are significantly disparate in nature. Images in the real world often exhibit high degrees of variation in scale, orientation, viewpoint, illumination and affine projection parameters, and are often accompanied by the presence of textureless regions and complete or partial occlusion of scene objects. The above conditions confound most correspondence determination techniques by rendering impractical the use of global contour-based descriptors or local pixel-level features for establishing correspondence. The proposed deep spectral correspondence (DSC) determination scheme harnesses the representational power of local feature descriptors to derive a complex high-level global shape representation for matching disparate images. The proposed scheme reasons about correspondence between disparate images using high-level global shape cues derived from low-level local feature descriptors. Consequently, the proposed scheme enjoys the best of both worlds, i.e., a high degree of invariance to affine parameters such as scale, orientation, viewpoint and illumination afforded by the global shape cues, and robustness to occlusion provided by the low-level feature descriptors. While the shape-based component within the proposed scheme infers what to look for, an additional saliency-based component dictates where to look, thereby tackling the noisy correspondences arising from the presence of textureless regions and complex backgrounds. In the proposed scheme, a joint image graph is constructed using distances computed between interest points in the appearance (i.e., image) space. Eigen-spectral decomposition of the joint image graph allows reasoning about shape similarity to be performed jointly in the appearance space and eigenspace.
Furthermore, a new benchmark dataset, consisting of disparate image pairs with extremely challenging variations in scale, orientation, viewpoint, illumination and affine projection parameters and characterized by the presence of complete or partial occlusion of objects, is introduced. The proposed dataset is supplemented with ground truth interest point annotations and is the largest and most comprehensive amongst publicly available image datasets pertaining to the problem of disparate image matching. The proposed scheme yields state-of-the-art performance for both coarse-grained shape-based correspondence determination and fine-grained point-wise correspondence determination on two existing challenging datasets as well as the newly introduced dataset.

Biometric authentication using local subspace adaptive histogram equalization

Gopal Chaudhary, Shefali Srivastava, Smriti Srivastava

In biometric authentication systems, such as palmprint recognition, fingerprint recognition, dorsal hand vein recognition and palm vein recognition, image enhancement plays a crucial role for most low-resolution image samples. In this work, a novel adaptive histogram equalization (AHE) variant, effective area-AHE (EA-AHE) with weights, is proposed. Here, global adaptive histogram equalization is improved using a local AHE technique by varying the effective area with different effective weights. The method is found to improve the biometric identification rate compared to typical AHE. To validate the proposed algorithm, the IITD palmprint databases of left and right hands are used in the simulations. The results show that the proposed technique is superior to the existing ones.
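For readers unfamiliar with the baseline, the sketch below shows plain global histogram equalization on a grayscale image stored as a nested list. It is only the textbook starting point: adaptive variants apply the same mapping over local windows, and the effective-area weighting of EA-AHE is not reproduced here. The function name and the nested-list image representation are assumptions made for illustration.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a 2-D list of integer gray levels."""
    flat = [px for row in image for px in row]
    hist = [0] * levels
    for px in flat:
        hist[px] += 1

    # Cumulative count of pixels at or below each gray level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]

    # Map the darkest occupied level to 0 and the brightest to levels - 1.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[px] for px in row] for row in image]
```

A low-contrast patch such as `[[50, 50], [51, 52]]` is stretched so that its darkest pixels map to 0 and its brightest pixel maps to 255.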

Information Fusion in Animal Biometric Identification

Gopal Chaudhary, Smriti Srivastava, Saurabh Bhardwaj, Shefali Srivastava

This work presents the application of biometrics, a highly researched topic in human recognition, to animal identification. Our analysis addresses the identification of zebras in their natural habitat. All the techniques are tested on 824 Plains zebra images captured at Ol’Pejeta Conservancy in Laikipia, Kenya. We use coat stripes, which are unique to each animal, as a biometric identifier. To improve identification performance, information from coat stripes can be fused from several regions of the zebra’s skin, such as near the legs, stomach and neck. Here, two regions, one near the stomach (flank) and one on the first limb (leg), are cropped from the stripe pattern and used for feature extraction. GMF, AAD, mean, and eigenface feature extraction methods are applied to the flank and limb ROIs. Then a novel image enhancement method, difference subplane adaptive histogram equalization, is applied to improve the identification rate. Our technique is based on score-level information fusion of the stomach (flank) and first limb (leg) regions. For this, the sum, product, Frank T-norm, and Hamacher T-norm rules are applied to validate the identification results. Information fusion improves on the previously reported results from eigenface, the CO-1 algorithm, and stripecodes. This improvement verifies the success of our score-level information fusion approach.
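The four fusion rules named above have standard closed forms, sketched here in Python for two match scores in [0, 1]. This is an illustration rather than the paper's code: the Frank parameter `p=2.0`, the averaging form of the sum rule, the `identify` helper and the example scores are assumptions.

```python
import math


def sum_rule(a, b):
    """Sum rule, written as the mean so the fused score stays in [0, 1]."""
    return (a + b) / 2.0


def product_rule(a, b):
    return a * b


def hamacher_tnorm(a, b):
    """Hamacher product; defined as 0 when both scores are 0."""
    if a == 0.0 and b == 0.0:
        return 0.0
    return (a * b) / (a + b - a * b)


def frank_tnorm(a, b, p=2.0):
    """Frank t-norm with parameter p (p > 0, p != 1)."""
    return math.log(1.0 + (p ** a - 1.0) * (p ** b - 1.0) / (p - 1.0), p)


def identify(stomach_scores, leg_scores, rule):
    """Fuse per-identity scores from both regions and return the best identity."""
    fused = {k: rule(stomach_scores[k], leg_scores[k]) for k in stomach_scores}
    return max(fused, key=fused.get)


# Hypothetical match scores for two enrolled animals.
stomach = {"zebra_a": 0.9, "zebra_b": 0.6}
leg = {"zebra_a": 0.4, "zebra_b": 0.8}
best = identify(stomach, leg, hamacher_tnorm)
```

Both t-norms satisfy T(a, 1) = a, so a perfect score from one region leaves the other region's score unchanged. A weak score from either region pulls the fused score down.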

Experience

Adobe Systems

Software Developer II

July 2018 - present

Worked as part of the Flow-Service team in Digital Experience to ingest data from multiple third-party sources onto Adobe Experience Platform. Worked closely with Informatica on the ETL SDK provided by Adobe to ingest data in the Adobe-standard XDM format onto Adobe Cloud; the SDK is available as open source on GitHub. Also implemented a framework for data transfer to One Data Model as part of the Open Data Initiative, a cross-company partnership between Adobe, Microsoft and SAP.

Royal Bank of Scotland

Software Development Intern

May 2017 - July 2017

Created a dashboard for the support team of the Corporate Business Data Mart (CBDM) department to track the timely arrival of upstream feeds and the generation of extracts for downstream services, using Oracle PL/SQL and Oracle APEX. Also implemented a small-scale design of OLAP cubes.

Projects

Project Based Learning to Biometrics: Analysis of scenarios affecting performance of Speaker Recognition systems.

A successful training program based on project-based learning (PBL) was developed for pre-final semester undergraduate students. Such demonstrations play an important role in teaching practical concepts and skills in engineering. Biometrics was chosen as the case study. Biometrics is the science and technology of measuring and analyzing biological data. In information technology, biometrics refers to technologies that measure and analyze human body characteristics, such as DNA, fingerprints, eye retinas and irises, voice patterns, facial patterns and hand measurements, for authentication purposes. The success of the program relies on a centralized approach to building a live biometric system, which includes understanding the system, data collection, feature extraction, modeling the system, and analyzing different scenarios affecting its performance. This provides a platform for students to develop their creativity in designing a biometric laboratory. It also offers real-life applications that can be used to improve public security and identification systems.

Stock Market prediction using Neural Networks

Comparison of Multi-Layer Feed Forward Neural Network with Radial Basis Function Neural Networks to approximate highly non-linear functions.

In the financial sector, price forecasting is an active problem. Since the indices associated with a stock are nonlinear and are affected by various internal and external factors, they are very difficult to model and pose a hard problem for researchers. This project is devoted to designing an intelligent prediction model based on the radial basis function network (RBFN). To tune its parameters, a learning algorithm is developed using the back-propagation (BP) method. The performance of the proposed method is also compared with that of a multi-layered feed-forward neural network (MLFFNN) containing only a single hidden layer, and the results of the simulation study indicate that the RBFN performs better than the MLFFNN model.
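A radial basis function network can be sketched in a few lines of Python. The version below is deliberately reduced and is not the project's model: it fits a toy sine curve rather than stock data, keeps the Gaussian centers and width fixed, and trains only the output weights with full-batch gradient descent, whereas the project tuned its parameters with back-propagation. All names and hyperparameters are assumptions.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian hidden-layer activations plus a constant bias feature."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers] + [1.0]


def predict(x, weights, centers, width):
    return sum(w * f for w, f in zip(weights, rbf_features(x, centers, width)))


def mse(xs, ys, weights, centers, width):
    errors = [(predict(x, weights, centers, width) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(xs)


def train(xs, ys, centers, width, lr=0.05, epochs=500):
    """Full-batch gradient descent on the output weights only."""
    weights = [0.0] * (len(centers) + 1)
    n = len(xs)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in zip(xs, ys):
            feats = rbf_features(x, centers, width)
            err = sum(w * f for w, f in zip(weights, feats)) - y
            for k, f in enumerate(feats):
                grads[k] += 2.0 * err * f / n  # d(MSE)/d(w_k)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Toy nonlinear target: one period of a sine wave sampled on [0, 1].
xs = [i / 20 for i in range(21)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
centers = [0.0, 0.25, 0.5, 0.75, 1.0]
weights = train(xs, ys, centers, width=0.2)
```

With the hidden layer fixed, the training problem is a linear least-squares fit, so a small enough learning rate lowers the error steadily. Learning the centers and widths as well makes the model more flexible but harder to optimise.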

Multimodal biometric system with new Palm-Phalanges Database

Construction of Palm Print Database and Application of Simple ML Algorithms for Biometric Identification at NSIT

To ensure the high performance of a biometric system, several unimodal systems are combined into a multimodal biometric system that overcomes their individual constraints. Here, a multimodal personal authentication system using palmprint, dorsal hand vein pattern and a novel biometric modality, the “palm-phalanges print”, is presented. First, we collected a new anterior hand database of 500 images from 50 individuals at the institute, referred to as NSIT Palmprint Database 1.0, using the NSIT palmprint device. From these anterior hand images, databases for palmprint and palm-phalanges print are created. In this biometric system, individuals do not have to undergo the distress of using two different sensors, since the palmprint and palm-phalanges print features can be captured from the same image, using the NSIT palmprint device, at the same time. For dorsal hand vein, the Bosphorus Hand Vein Database is used because of the stability and uniqueness of hand vein patterns. We propose the fusion of three biometric modalities, palmprint (PP), palm-phalanges print (PPP) and dorsal hand vein (DHV), and perform score-level fusion using the PP-PPP, PP-DHV, PPP-DHV and PP-PPP-DHV strategies. Lastly, we use K-nearest neighbor, support vector machine and random forest classifiers to validate the matching stage. The results support the validity of the proposed modality and show that multimodal fusion has an edge over unimodal systems.

Education

Netaji Subhas Institute of Technology, Delhi University

BE in Information Technology, CGPA - 9.4/10

August 2014 - June 2018

Delhi Public School, Dwarka

Class XII, 96%

Class X, CGPA - 10/10

April 2000 - June 2014

Incoming Grad Student - MSCV, Carnegie Mellon University