Indoor air quality (IAQ) in houses is often degraded by chemical substances emitted from heating, building materials, or other household goods. Since it is difficult for occupants to recognize air pollution, they rarely understand the actual conditions of the IAQ. An investigation into the actual condition of IAQ in houses was therefore conducted in this study. Carbon dioxide (CO2) concentrations in 24 occupied houses were measured, and the results from our analysis showed that the use of combustion heaters increased the concentration of CO2 and led to indoor air pollution. Results indicate that as outdoor temperature decreased, the frequency of ventilation decreased simultaneously, and CO2 concentration increased. Results of the questionnaire survey revealed that the actual IAQ in each house did not match the level of awareness its occupants had regarding ventilation. Along with this difficulty in perceiving air pollution, the lack of knowledge about ventilation systems and the ...
With the advance of the Human Genome Project, a huge amount of varied genome data has been stored in a number of databases, and the WWW is widely used to access these databases. From the viewpoint of the information supplier, the WWW is quite a useful tool for providing various types of data easily, but from the viewpoint of the information consumer, it is not good enough because of the lack of a rigid data format and the difficulty of data access. In this paper, by extending a current WWW browser, we propose two generic WWW tools, MetaViewer and MetaCommander, apply them to genome informatics to support researchers who search, analyze, and dispatch genome data, and discuss their potential advantages from the viewpoint of the information consumer.
The multiple sequence alignment problem is one of the important problems in Genome Informatics. The notable feature of this problem is that its state-space forms a lattice. Researchers have applied search algorithms such as A* and memory-bounded search algorithms including SNC to this problem. Unfortunately, previous work could align only seven sequences at most. Korf proposed DCBDS, which exploits the features of a grid, and suggested that DCBDS probably solved this problem effectively. We found, however, that DCBDS was not effective for aligning many sequences. In this paper, we propose a simple and effective search algorithm, A* with Partial Expansion, for state-spaces with large branching factors. The aim of this algorithm is to store only the nodes necessary for finding an optimal solution. In node expansion, A* stores all child nodes, while our algorithm stores only promising child nodes. This mechanism enables us to reduce the memory requirements during a search. We apply our alg...
This paper proposes an in-home behavioral observation method employing Internet of Things (IoT) sensors. Behavioral change programs based on information provision approaches have begun to be employed in the reduction of carbon dioxide emissions in the residential sector. To improve efforts to save energy, a behavioral observation method that aims to understand the reality of users’ daily activities could be an effective approach. However, problems with existing methods include observation costs, privacy implications, and other complications regarding the specific behaviors of the person being observed. An in-home behavioral observation method employing IoT sensors is therefore proposed to both reduce costs and alleviate the privacy impact on users’ in-home activities. The use of sensor-based observation presents several relevant advantages. For example, the cost of sensor-based observation is relatively low compared to human-based approaches. In addition, it employs a minimum ...
With linear-storage search, the same nodes are eventually revisited many times because only the search paths are stored in memory. Some algorithms such as MREC have been proposed to solve this problem by storing nodes as well. MREC is an algorithm that reduces the number of nodes revisited by storing a certain number of nodes located near the root node. This paper proposes stochastic node caching, which stores nodes on a probabilistic basis. As a result, mostly the frequently visited nodes are stored, so that the number of nodes revisited can be reduced efficiently while using limited memory resources. To demonstrate the efficiency of stochastic node caching, the method was compared with MREC, which pursues the same goal. Experiments were performed using the 15-puzzle problem, a typical problem for linear-storage search, and a more complicated problem of gene alignment. The experiments illustrated the properties of the two algorithms and showed that stochastic node caching is efficient in reducing the number of nodes revisited.
Electronics and Communications in Japan (Part II: Electronics), 2003
With the development of the human genome analysis project, it is becoming possible to utilize large-scale genome sequence data. One genome analysis method based on large-scale sequence data is genome sequence walking. By applying sequence walking to a segment sequence database, it is possible to estimate the whole sequence of the gene to which the query sequence belongs by using the gene segments. With sequence walking, the researcher can estimate the genome sequence without going through biological experiments. This saves time and expense in sequence determination. Sequence walking has been performed using the well-known BLAST. BLAST, however, is a tool based on similarity search, and is not adequate for sequence walking, in which segments of the same gene are connected, from the viewpoint of either efficiency or accuracy. In this study, it is shown that genome sequence walking is not a similarity search problem but a string matching problem permitting errors. A system dedicated to sequence walking is constructed by improving the string matching algorithm so that it is better suited to sequence walking. The result has been made public on the WWW. The proposed system can realize sequence walking that is faster and more accurate than conventional sequence walking by BLAST, thus reducing the burden on the researcher.
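The framing of sequence walking as string matching that permits errors can be illustrated with a standard textbook technique, Sellers-style dynamic programming for approximate substring search. This is a minimal sketch and not the improved algorithm of the paper, which the abstract does not describe; the function name `find_approx` and the unit edit costs are assumptions made for illustration.

```python
def find_approx(pattern: str, text: str, max_errors: int) -> list[tuple[int, int]]:
    """Report (end, errors) for each prefix text[:end] that ends with a
    substring within max_errors edits of pattern (unit-cost edits assumed)."""
    m = len(pattern)
    # col[i]: fewest edits turning pattern[:i] into some substring of the
    # text that ends at the current position. Before any text is read,
    # pattern[:i] can only be matched by deleting all i characters.
    col = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        diag = col[0]  # col[0] stays 0, so a match may start at any position
        for i in range(1, m + 1):
            sub = diag + (pattern[i - 1] != ch)          # match or substitute
            best = min(sub, col[i] + 1, col[i - 1] + 1)  # or insert / delete
            diag, col[i] = col[i], best
        if col[m] <= max_errors:
            hits.append((j + 1, col[m]))
    return hits
```

For instance, `find_approx("ACGT", "TTACGTTT", 0)` reports the single exact occurrence ending at position 6, while raising `max_errors` also admits occurrences containing substitutions, insertions, or deletions. Unlike a similarity search, every reported hit is guaranteed to lie within the stated error budget.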
In data analysis, the necessary data are not always prepared in a database in advance. If the precision of extracted classification knowledge is not sufficient, gathering additional data is sometimes necessary. In practice, if some attributes critical to the classification are missing from the database, it is very important to identify such missing attributes effectively in order to improve the precision. In this paper, we propose a new method to identify the attributes that will improve the precision of Support Vector Classifiers (SVC) based solely on values of candidate attributes for a very limited number of entities. In experiments, we show that the incremental addition of attributes by the proposed method effectively improves the precision of SVC using only a very small number of entities.
The multiple sequence alignment problem is one of the important problems in Genome Informatics. The notable feature of this problem is that its state-space forms a lattice. Researchers have applied search algorithms such as A* and memory-bounded search algorithms including SNC to this problem. Unfortunately, previous work could align only seven sequences at most. Korf proposed DCBDS, which exploits the features of a grid, and suggested that DCBDS probably solved this problem effectively. We found, however, that DCBDS was not effective for aligning many sequences. In this paper, we propose a simple and effective search algorithm, A* with Partial Expansion, for state-spaces with large branching factors. The aim of this algorithm is to store only the nodes necessary for finding an optimal solution. In node expansion, A* stores all child nodes, while our algorithm stores only promising child nodes. This mechanism enables us to reduce the memory requirements during a search. We apply our algorithm to the multiple sequence alignment problem. It can align seven sequences with only 4.7% of the stored nodes required by A*.
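The idea of storing only promising child nodes can be sketched as follows. This is one possible reading rather than the paper's exact procedure: it assumes that a child counts as promising when its f-value does not exceed the parent's current priority plus a cutoff, and that a parent with unstored children is put back on the open list with its priority raised to the smallest f-value among them. The names `pea_star`, `edit_lattice`, and `cutoff` are hypothetical, and a two-sequence lattice with unit costs stands in for the multi-sequence alignment lattice.

```python
import heapq
import itertools
from math import inf

def pea_star(start, is_goal, neighbors, h, cutoff=0):
    """Return (optimal cost, number of stored nodes), or (None, stored)."""
    g = {start: 0}               # best known path cost of every stored node
    open_f = {start: h(start)}   # current priority F of each open node
    tie = itertools.count()      # tie-breaker so the heap never compares nodes
    heap = [(open_f[start], next(tie), start)]
    while heap:
        F, _, node = heapq.heappop(heap)
        if open_f.get(node) != F:
            continue             # stale heap entry
        if is_goal(node):
            return g[node], len(g)
        next_F = inf             # smallest f among children left unstored
        for child, cost in neighbors(node):
            child_g = g[node] + cost
            child_f = child_g + h(child)
            if child_f > F + cutoff:
                next_F = min(next_F, child_f)    # unpromising: do not store
            elif child_g < g.get(child, inf):
                g[child] = child_g
                open_f[child] = child_f
                heapq.heappush(heap, (child_f, next(tie), child))
        if next_F == inf:
            del open_f[node]     # every child is stored: node is closed
        else:
            open_f[node] = next_F    # reinsert with a raised priority
            heapq.heappush(heap, (next_F, next(tie), node))
    return None, len(g)

def edit_lattice(a: str, b: str):
    """Two-sequence alignment lattice with unit gap and mismatch costs."""
    goal = (len(a), len(b))
    def neighbors(node):
        i, j = node
        if i < len(a):
            yield (i + 1, j), 1
        if j < len(b):
            yield (i, j + 1), 1
        if i < len(a) and j < len(b):
            yield (i + 1, j + 1), int(a[i] != b[j])
    def h(node):
        # Admissible: at least this many gaps are needed to even out lengths.
        return abs((len(a) - node[0]) - (len(b) - node[1]))
    return (0, 0), (lambda n: n == goal), neighbors, h
```

Calling `pea_star(*edit_lattice("kitten", "sitting"))` returns the alignment cost together with the number of nodes that were stored. A very large `cutoff` stores every child and so behaves like plain A*, which makes the two settings easy to compare on the same lattice.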
Linear-space search algorithms such as IDA* (Iterative Deepening A*) cache only those nodes on the current search path, and may therefore revisit the same node again and again. This causes IDA* to take an impractically long time to find a solution. In this paper, we propose a simple and effective algorithm called Stochastic Node Caching (SNC) for reducing the number of revisits. SNC caches a node together with the best currently known estimate of the minimum cost from that node to the goal node. Unlike previous related research such as MREC, SNC caches nodes selectively, based on a fixed probability. We demonstrate that SNC can effectively reduce the number of revisits compared to MREC, especially when the state-space forms a lattice.
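A rough sketch of probabilistic caching inside IDA* is given below. It is an illustration under stated assumptions, not the published SNC procedure: a node is considered for caching each time a search below it fails, it is admitted with fixed probability `p` while the cache has room, and the state-space is assumed to be acyclic (as in an alignment lattice) so that cached lower bounds stay valid on every path. The names `ida_star_snc`, `edit_lattice`, `capacity`, and `seed` are hypothetical.

```python
import random
from math import inf

def ida_star_snc(start, is_goal, neighbors, h, p=0.1, capacity=10_000, seed=0):
    """Return (optimal cost or None, number of node visits)."""
    rng = random.Random(seed)
    cache = {}      # node -> best known lower bound on the cost to the goal
    visits = 0

    def search(node, g, bound):
        # Returns (solution cost or None, lower bound on cost from node).
        nonlocal visits
        visits += 1
        est = cache.get(node, h(node))
        if g + est > bound:
            return None, est            # cut off by the current threshold
        if is_goal(node):
            return g, 0
        best = inf
        for child, cost in neighbors(node):
            found, child_est = search(child, g + cost, bound)
            if found is not None:
                return found, 0
            best = min(best, cost + child_est)
        best = max(best, est)
        if node in cache:
            cache[node] = best          # refresh an already cached node
        elif len(cache) < capacity and rng.random() < p:
            cache[node] = best          # admit a new node with probability p
        return None, best

    bound = h(start)
    while bound < inf:
        found, next_bound = search(start, 0, bound)
        if found is not None:
            return found, visits
        bound = next_bound              # smallest estimate that exceeded bound
    return None, visits

def edit_lattice(a: str, b: str):
    """Two-sequence alignment lattice with unit gap and mismatch costs."""
    goal = (len(a), len(b))
    def neighbors(node):
        i, j = node
        if i < len(a):
            yield (i + 1, j), 1
        if j < len(b):
            yield (i, j + 1), 1
        if i < len(a) and j < len(b):
            yield (i + 1, j + 1), int(a[i] != b[j])
    def h(node):
        return abs((len(a) - node[0]) - (len(b) - node[1]))
    return (0, 0), (lambda n: n == goal), neighbors, h
```

Because a node that is reached often gets many chances to be admitted, frequently revisited nodes tend to end up in the cache even for small `p`. Running the function with `p=0.0` gives plain IDA*, so the visit counts of the two settings can be compared directly.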
Papers by Teruhisa Miura