This paper develops a GIS (geographical information system)-based data mining approach for optimally selecting the locations and determining the installed capacities of distributed biomass power generation systems in the context of decentralized energy planning for rural regions. The optimal locations within a cluster of villages are obtained by matching the installed capacity with the demand for power while minimizing the cost of transporting biomass from dispersed sources to the power generation system and the cost of distributing electricity from the power generation system to the demand centers or villages. The methodology was validated by using it to develop an optimal plan for implementing distributed biomass-based power systems to meet the rural electricity needs of Tumkur district in India, which consists of 2700 villages. The approach uses a k-medoid clustering algorithm to divide the total region into clusters of villages and locates biomass power generation systems at the medoids. The optimal value of k is determined iteratively by running the algorithm over the entire search space for different values of k, subject to demand-supply matching constraints. The value of k is chosen such that it minimizes the total cost of system installation, biomass transportation, and transmission and distribution. A smaller region consisting of 293 villages was selected to study the sensitivity of the results to varying demand and supply parameters. The results of clustering are represented on a GIS map of the region.
Electricity provides impetus to economic as well as human development. However, in the majority of developing countries, a large section of the population, especially in rural areas, is deprived of the benefits of electricity access and of economic development. India is no exception, with 364 million poor people lacking access to electricity. On one hand, conventional, fossil fuel based, centralized power generation systems have catered mainly to urban and industrial needs but have failed to address the energy needs of the rural poor. On the other hand, the finite fossil fuel resources used for power generation are being exploited in an uncontrolled manner. There is therefore an urgent need to explore renewable energy options that are abundantly available and can be operated in decentralized mode at smaller capacities. Among the renewable energy alternatives for power generation, the biomass energy route is considered advantageous because: (i) biomass systems can be set up at any location where vegetation and animal rearing are present, (ii) biomass is available all year round without seasonal variation, ensuring non-intermittent supply, and (iii) it is cheap and easily portable, and its environmental hazards are minimal. The woody biomass required for power generation can be produced without destroying natural forests by growing dedicated plantations on abandoned and degraded land that has few competing uses. Biomass can also be obtained from the crop residues of agricultural lands and plantations. Power generated from biomass offers other intangible benefits such as wasteland development, degraded land reclamation, environmental hazard reduction and local employment generation.
India has a large biomass resource potential in terms of agro and forest residues, and approximately 40 million ha of wasteland on which biomass could be grown. The current potential for power generation from agro and forest residues alone is estimated to be 16,000 MW. Therefore, for India, expanding the use of locally available biomass resources to generate electricity is a logical strategy to address the rural power challenge. Biomass is not uniformly distributed over a geographical region, and when locally available biomass resources are insufficient to meet the local electricity needs, biomass has to be imported. When biomass is collected from many villages and transported to a power generation facility, the logistics system has to be designed efficiently, and it must address the optimal location of the biomass power facility.
Location decisions for a biomass power generation facility depend mainly on two components of variable cost. The first component is the cost of procuring biomass and transporting it from the dispersed sources to the power generation facilities (or biomass power generation systems). The second component is the cost of evacuating the generated electricity and supplying it to the various demand centers (households, agriculture, micro-industry, etc.). In this context, the relevant costs are those of the local transmission and distribution systems. Since these biomass power systems are decentralized and of small capacity, the dominant cost will be that of the distribution system rather than transmission. Biomass energy systems can therefore become operationally cost effective if they are located so as to minimize the cost of transporting biomass from the source and the cost of transmission and distribution from the energy production system to the demand centers. To support such decisions, this paper presents the development and validation of a mathematical model that determines the optimal location of biomass power systems to minimize the cost of transportation. In addition, the model facilitates decisions on the optimal installed capacity of the biomass systems by matching the given demand for power with the available quantity of biomass. The key element of the current study is to cluster the villages in the study area, define biomass power generation centers within the clusters, collect biomass from various villages (supply points) and transport sufficient quantities of it to the biomass power generation center within the cluster, so that the power plant requirement is satisfied at the least cost and the dynamic demand for power is met. Tumkur district in India, consisting of 2700 villages, is selected for validating the model.
Two scenarios, one medium-term (2015) and one long-term (2030), were developed to validate the model with projected demand for energy, dynamic loads and biomass potential.
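The clustering idea outlined above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes straight-line distances between villages, a simple alternating k-medoid procedure, a single feasibility check per cluster, and a cost that charges every unit of biomass and demand by its distance to the plant. All function names, cost coefficients (`c_install`, `c_transport`, `c_distribution`) and the `fuel_per_unit_demand` factor are hypothetical placeholders chosen for illustration.

```python
import numpy as np

def k_medoids(dist, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix (illustrative)."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign every village to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # New medoid: member with the smallest total distance to the others.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

def is_feasible(labels, k, supply, demand, fuel_per_unit_demand=1.0):
    """Assumed demand-supply check: each cluster must cover its own fuel need."""
    for c in range(k):
        m = labels == c
        if supply[m].sum() < fuel_per_unit_demand * demand[m].sum():
            return False
    return True

def total_cost(dist, medoids, labels, supply, demand,
               c_install=1000.0, c_transport=1.0, c_distribution=2.0):
    """Assumed cost: installation + biomass transport + electricity distribution."""
    d_to_plant = dist[np.arange(len(labels)), medoids[labels]]
    return (c_install * len(medoids)
            + c_transport * np.sum(supply * d_to_plant)
            + c_distribution * np.sum(demand * d_to_plant))

def choose_k(coords, supply, demand, k_values):
    """Run k-medoids for each candidate k and keep the cheapest feasible plan."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    best = None
    for k in k_values:
        medoids, labels = k_medoids(dist, k)
        if not is_feasible(labels, k, supply, demand):
            continue
        cost = total_cost(dist, medoids, labels, supply, demand)
        if best is None or cost < best[0]:
            best = (cost, k, medoids, labels)
    return best  # None if no candidate k is feasible

# Synthetic demo data (not from the study area).
rng = np.random.default_rng(42)
coords = rng.uniform(0.0, 50.0, size=(40, 2))   # village coordinates, km
supply = rng.uniform(5.0, 15.0, size=40)        # biomass available per village
demand = rng.uniform(1.0, 4.0, size=40)         # electricity demand per village
plan = choose_k(coords, supply, demand, range(1, 9))
if plan is not None:
    print("chosen k:", plan[1], "plant villages:", plan[2])
```

The trade-off the sketch exposes is the one described in the text: a larger k raises the installation term but shortens the distances that drive the transport and distribution terms, so the cheapest feasible k lies somewhere in between.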
In the last few decades, owing to the emergence of a multitude of innovative applications in both private and public sector enterprises, a strategic approach has had to be adopted for locating facilities such as warehouses, hospitals, schools and fire stations. A vast number of distribution models have been formulated to locate facilities and to allocate demand and capacity to them. The objective of these models is to minimize the total installation and operational costs of the facilities. The models differ from one another in mathematical structure, computational time and complexity.
A summary of continuous location models, network location models, integer programming problems and other state-of-the-art problems was presented by Klose and Drexl (2005). Francis et al. (1983) provided a review of the literature on location analysis. Current et al. (1990) reviewed the multi-objective aspects of facility location analysis. They found that most of the literature consisted of cost minimization formulations, that some papers dealt with demand-oriented objectives, and that only a few were of the profit maximization type.
Melkote and Daskin (2001) investigated a model that simultaneously optimizes facility location and the design of the underlying network topology. They discuss network location problems such as the set covering location problem, the maximum covering location problem, and the P-median and P-center problems, in which the underlying network topology can have a significant impact on the optimal location decision. They also showed that their model can be applied successfully in areas such as regional planning, power distribution and energy management. Syam (2008) formulated a multiple server location-allocation model that considers most of the relevant costs and parameters, namely the transportation cost, facility cost, waiting time cost, queuing time, multiple server facilities and distance constraints.
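For reference, the P-median problem mentioned above is commonly written in the following textbook form. The notation is an assumption introduced here and is not taken from the cited works: $w_i$ is the demand at node $i$, $d_{ij}$ the distance from node $i$ to candidate site $j$, $y_j = 1$ if a facility is opened at $j$, and $x_{ij} = 1$ if demand node $i$ is assigned to the facility at $j$.

```latex
\begin{align}
\min \quad & \sum_{i} \sum_{j} w_i \, d_{ij} \, x_{ij} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = 1 && \forall i \\
& x_{ij} \le y_j && \forall i, j \\
& \sum_{j} y_j = p \\
& x_{ij},\, y_j \in \{0, 1\} && \forall i, j
\end{align}
```

The first constraint assigns each demand node to exactly one facility, the second allows assignment only to open facilities, and the third fixes the number of facilities at $p$.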
Nema and Gupta (1999) examined waste-technology compatibility in locating waste treatment and disposal facilities. They formulated a multi-objective problem integrating both cost and risk parameters. Maniezzo et al. (1998) developed a decision support system for siting industrial waste management plants that minimizes total costs and environmental impacts. The resulting formulation was NP-complete and could be solved only by adopting heuristic procedures. The optimal choice of location, technology and routing for hazardous waste was investigated by Alumur and Kara (2007).