Abstract
In the IMSTAGRID spatio-temporal clustering algorithm, the data interval (L) used when partitioning data objects into cells is still set by manual tuning until the cluster results are acceptable. This lengthens the clustering process and makes the chosen value of L unreliable, because the choice does not take the volume of data into account. In addition, the size of L is not adjusted to the distribution of the existing spatial and temporal data, so the data cube structure formed when the data dimensions are partitioned cannot adapt to the volume of data in either the spatial or the temporal dimension. To address these deficiencies, this research optimizes the determination of the spatial and temporal dimension intervals by calculating the optimum value of L from the volume of data and forming the data cube structure accordingly. If the value of L can be determined from the volume of data in each dimension, the algorithm no longer needs a trial-and-error process, and the value of L used will be appropriate to the volume of data. The research will be carried out over two years: in the first year, the optimum value of L for each spatial and temporal dimension will be sought; in the second year, a trial will be carried out on the data labeled by the clustering method, using a deep learning classification approach.
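The idea of deriving the interval from the volume of data rather than tuning it by hand can be illustrated with a minimal sketch. It is not the method proposed in this research: the abstract does not state the formula for the optimum L, so the sketch substitutes Sturges' rule, a generic histogram heuristic, as a placeholder. It also assumes that L denotes the width of one interval in each dimension. The function names `interval_count`, `interval_widths`, and `partition` are hypothetical.

```python
import math
from collections import Counter


def interval_count(n):
    """Number of intervals per dimension for n data objects.

    ASSUMPTION: Sturges' rule is a stand-in heuristic, not the
    optimum-L calculation of the research described above.
    """
    return math.ceil(math.log2(n)) + 1 if n > 1 else 1


def interval_widths(points, k):
    """Width L of one interval in each dimension, given k intervals."""
    dims = list(zip(*points))
    return [(max(d) - min(d)) / k for d in dims]


def partition(points):
    """Assign each (x, y, t) point to a cube cell and count points per cell."""
    k = interval_count(len(points))
    mins = [min(d) for d in zip(*points)]
    widths = interval_widths(points, k)
    cells = Counter()
    for p in points:
        # Clamp to k - 1 so the maximum value falls in the last cell.
        idx = tuple(
            min(int((v - lo) / w), k - 1) if w > 0 else 0
            for v, lo, w in zip(p, mins, widths)
        )
        cells[idx] += 1
    return k, widths, cells


if __name__ == "__main__":
    # Hypothetical (x, y, t) records, for illustration only.
    sample = [(0.0, 0.0, 0), (1.0, 1.0, 1), (4.0, 4.0, 4), (8.0, 8.0, 8)]
    k, widths, cells = partition(sample)
    print(k, widths, dict(cells))
```

Because the number of intervals is computed from the number of data objects, the cell size changes with the volume of data without a manual tuning loop. A distribution-aware rule would replace `interval_count` while leaving the partitioning step unchanged.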