Abstract: Beamforming is central to any wireless system. Depending on the channel and signal scenario, one can use the MMSE or MSINR criterion to find an optimal beamformer. It is well known in the literature that if the channel is unknown then...
This paper presents a predictive controller for handling plug-and-play (P&P) charging requests of electric vehicles (EVs) in a distribution system. The proposed method uses a two-stage hierarchical control scheme based on a model...
      Power System, Model Predictive Control (MPC), Integrated Renewable Energy System, Smart Grid
In recently proposed multiple access techniques such as IDMA and OFDM-IDMA, users are separated by user-specific interleavers, in contrast to the conventional CDMA scheme, where user separation is assured with user-specific signature...
      Engineering, Technology, Performance Analysis, Prime Number
We present a novel solution to the problem of robotic grasping of unknown objects using a machine learning framework and a Microsoft Kinect sensor. Using only image features, without the aid of a 3D model of the object, we implement a...
We consider the problem where M agents interact with M identical and independent environments with S states and A actions using reinforcement learning for T rounds. The agents share their data with a central server to minimize their...
Teleoperated surgical robots can provide immediate medical assistance in austere and hostile environments. However, such scenarios are time-sensitive and thus require high-bandwidth and low-latency communication links which might be...
    • Computer Science
In the future, deployable, teleoperated surgical robots can save the lives of critically injured patients in battlefield environments. These robotic systems will need to have autonomous capabilities to take over during communication...
Many engineering problems have multiple objectives, and the overall aim is to optimize a non-linear function of these objectives. In this paper, we formulate the problem of maximizing a non-linear concave function of multiple long-term...
      Engineering, Computer Science, arXiv
We consider the constrained Markov Decision Process (CMDP) problem, where an agent interacts with a unichain Markov Decision Process. At every interaction, the agent obtains a reward. Further, there are K cost functions. The agent aims...
      Engineering, Computer Science, arXiv
Many real-world problems, such as Social Influence Maximization, face the dilemma of choosing the best K out of N options at a given time instant. This setup can be modeled as a combinatorial bandit which chooses K out of N arms at each time,...
Quantum key distribution (QKD) allows two distant parties to share encryption keys with security based on the laws of quantum mechanics. To share the keys, the quantum bits have to be transmitted from the sender to the receiver over...
Mean field control (MFC) is an effective way to mitigate the curse of dimensionality of cooperative multi-agent reinforcement learning (MARL) problems. This work considers a collection of $N_{\mathrm{pop}}$ heterogeneous agents that can be segregated...
We consider the bandit problem of selecting $K$ out of $N$ arms at each time step. The reward can be a non-linear function of the rewards of the selected individual arms. The direct use of a multi-armed bandit algorithm requires choosing...
Many real-world problems face the dilemma of choosing the best $K$ out of $N$ options at a given time instant. This setup can be modelled as a combinatorial bandit which chooses $K$ out of $N$ arms at each time, with an aim to achieve an...
      Mathematics, Computer Science, arXiv
We consider the problem where N agents collaboratively interact with an instance of a stochastic K-armed bandit problem with K ≫ N. The agents aim to simultaneously minimize the cumulative regret over all the agents for a total of T time...
      Computer Science, arXiv
Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating...
      Mathematics, Computer Science, Physics
We consider the problem of tabular infinite-horizon concave utility reinforcement learning (CURL) with convex constraints. Various learning applications with constraints, such as robotics, do not allow for policies that can violate...
      Computer Science, arXiv
Reinforcement Learning (RL) is being increasingly applied to optimize complex functions that may have a stochastic component. RL is extended to multi-agent systems to find policies to optimize systems that require agents to coordinate or...
      Mathematics, Computer Science, arXiv
Reinforcement learning is widely used in applications where one needs to make sequential decisions while interacting with the environment. The problem becomes more challenging when the decision requirement includes satisfying some...
      Computer Science, arXiv