Removing duplicate indices from mysql database with priority

https://stackoverflow.com/questions/12742043

05-07-2021
|

Вопрос

I am making a procedure for optimizing indices of my databases and it will take me long. So I want to confirm if it is feasible to do? And not already done otherwise want to get it.

I want to read all indices from all tables.
I want to drop all other indices on fields which are primary key
After doing above. The fields which are still indexed as unique. I want to drop all other indices on them
The fields which are neither primary nor unique. I want to keep only one index on each and drop all other indices

I am curious Why MYSQL allows unique index on a primary key field? Can it make any useful difference?

Question : I need guidance/query/procedure to remove all non useful indices with the hierarchy mentioned in my above 4 points

Update 1: I will keep my work updating Here on SQLFiddle. At the moment it is just beginning. However you can see on this link that there are four fields and 8 indices. I want just 3 of them and drop all others. I need only 1st, 3rd and 4th. According to my above 4 points.

Update 2 : I got excellent answer by eggayal. First link he gave is the pure sql solution. I tried it Here at Link2. It gave me undesired output. Output of link2 can be compared by looking into Link1 here

Desired output Is

    COLUMN_NAMES    REDUNDANT_INDEXES
1   auth_id         `auth_id_3`,`auth_id_2`,`auth_id`
2   id              `id_2`,`id_3`
3   subject         `subject_1`

Faults in th output of query in Link2

Row1: auth_id is not shown as redundant index because It has nothing to do (compare) with unique key auth_id_4 on the same field. But I need it because I don't need this index when same column has a unique index as well

Row2 : I will like to say all other indices redundant when there exists primary key index on some column

Row3 : All right

Решение

I have made an application in c#. It removes duplicate indices according to your priority, Hope it would help

It might need much improvements but what I know is... It does not remove only those duplicate indices which are involved in primary(composite) and foreign key at a time (which should not be usually good approach)

Here is download link for complete application with source

Following is the main file of above link

using System;
using System.Data;
using MySql.Data.MySqlClient;
using System.Collections.Generic;

namespace duplicateIndexRemover
{
public class duplicateIndexRemover
{

    static List<string> toDrop;
    public string main(System.Windows.Forms.DataGridView dgv, System.Windows.Forms.DataGridView dgv1, string dbName)
    {            
        try
        {
            toDrop = new List<string>();
            List<table> tbs = new List<table>();
            DataTable dt1 = new DataTable();
            string cnStr = "SERVER=localhost;DATABASE=" + dbName + ";UID=root;";
            MySqlConnection conn = new MySqlConnection(cnStr);
            MySqlCommand cmd = conn.CreateCommand();
            cmd.CommandText = @"SELECT Table_Name,Column_Name,Index_Name,NON_UNIQUE
            FROM information_schema.STATISTICS
            WHERE table_schema = '" + dbName + "' order by Table_Name,Column_Name,Index_Name";
            MySqlDataAdapter adp = new MySqlDataAdapter(cmd);
            DataTable dt = new DataTable();
            adp.Fill(dt);
            dgv.DataSource = dt;
            for (int i = 0; i < dt.Columns.Count - 1; i++)
                dt1.Columns.Add(dt.Columns[i].ColumnName);

            table tb = new table();
            column cl = new column();
            index dx = new index();
            tb.nam = dt.Rows[0][0].ToString();
            cl = addColumn(dt, tb, 0);
            tbs.Add(tb);

            for (int i = 1; i < dt.Rows.Count; i++)
            {
                if (tb.nam != dt.Rows[i][0].ToString())
                {
                    // 1st column of (current) t_th table
                    tb = new table();
                    tb.nam = dt.Rows[i][0].ToString();
                    cl = addColumn(dt, tb, i);
                    tbs.Add(tb);
                }
                else
                {
                    if (cl.nam != dt.Rows[i][1].ToString())
                        cl = addColumn(dt, tb, i);
                    else
                    {

                        // Duplicate Indices
                        // But this one may be primary/unique key
                        // Then it would not be good to make a drop statement for this index here
                        // It may be improvable, but i can not apply as well condition here if it is not primary key
                        addIndex(dt, cl, i);
                    }
                }
            }
            makeDropStatements(tbs, dt1);

            dgv1.DataSource = dt1;
            cmd.Connection.Open();
            for (int i = 0; i < toDrop.Count; i++)
            {
                cmd.CommandText = toDrop[i];
                try
                {
                    cmd.ExecuteScalar();
                }
                catch//(Exception ex)
                {
                    //System.Windows.Forms.MessageBox.Show("Table : " + dt1.Rows[i][0] + " Column : " + dt1.Rows[i][1] + "\n\n" + ex.Message);
                }
            }

            cmd.CommandText = @"select table_name from information_schema.STATISTICS
            WHERE table_schema = '" + dbName + "' group by table_name,column_name";
            DataTable dg = new DataTable();
            adp.Fill(dg);

            string msg = " Total Number of Indices : " + dt.Rows.Count;
            msg += "\t Droppable Indices : " + toDrop.Count;
            msg += "\t Total Number of Indexed Columns : " + dg.Rows.Count;
            return msg;
        }
        catch (Exception ex)
        {
            System.Windows.Forms.MessageBox.Show(ex.Message);
            return ex.Message;
        }
    }

    private static column addColumn(DataTable dt, table tb, int i)
    {
        column cl = new column();
        // 1st index of i_th column of t_th table
        cl.nam = dt.Rows[i][1].ToString();
        addIndex(dt, cl, i);
        tb.cols.Add(cl);
        return cl;
    }

    private static void addIndex(DataTable dt, column cl, int i)
    {
        index dx = new index();
        dx.nam = dt.Rows[i][2].ToString();
        dx.non_unique = Convert.ToBoolean(dt.Rows[i][3]);
        cl.indices.Add(dx);
    }


    private static void makeDropStatements(List<table> tbs, DataTable dt1)
    {
        bool chekd;
        List<index> temp;
        for (int t = 0; t < tbs.Count; t++)
        {                
            for (int i = 0; i < tbs[t].cols.Count; i++)
            {                    
                temp = tbs[t].cols[i].indices;
                if (temp.Count > 1)
                {

                    chekd = false;
                    for (int j = 0; j < temp.Count; j++)
                    {
                        if (temp[j].nam == "PRIMARY")
                        {
                            getToDropIndices(tbs[t].nam, tbs[t].cols[i].nam, temp, j, dt1);
                            chekd = true;
                            break;
                        }
                    }
                    if (!chekd)
                    {
                        for (int j = 0; j < temp.Count; j++)
                        {
                            if (!temp[j].non_unique)
                            {
                                getToDropIndices(tbs[t].nam, tbs[t].cols[i].nam, temp, j, dt1);
                                chekd = true;
                                break;
                            }
                        }
                    }
                    if (!chekd)
                    {
                        getToDropIndices(tbs[t].nam, tbs[t].cols[i].nam, temp, 0, dt1);
                        chekd = true;
                        break;
                    }
                }
            }
        }
    }

    private static void getToDropIndices(string tbl, string col, List<index> sublist, int nt, DataTable dt1)
    {
        for (int j = 0; j < nt; j++)
        {
            toDrop.Add("alter table `" + tbl + "` drop index " + sublist[j].nam);
            dt1.Rows.Add(dt1.NewRow());
            int r = dt1.Rows.Count - 1;
            dt1.Rows[r][0] = tbl;
            dt1.Rows[r][1] = col;
            dt1.Rows[r][2] = sublist[j].nam;
        }
        for (int j = nt + 1; j < sublist.Count; j++)
        {
            toDrop.Add("alter table `" + tbl + "` drop index " + sublist[j].nam);
            dt1.Rows.Add(dt1.NewRow());
            int r = dt1.Rows.Count - 1;
            dt1.Rows[r][0] = tbl;
            dt1.Rows[r][1] = col;
            dt1.Rows[r][2] = sublist[j].nam;
        }
    }
}

public class table
{
    public List<column> cols =new List<column>();
    public string nam = "";
}

public class column
{
    public List<index> indices = new List<index>();
    public string nam = "";
}

public class index
{
    public string nam = "";
    public bool non_unique;
}
}

You may ignore/remove gridviews they are just for showing your indices. You just need to call main function

Другие советы

Roland Bouman wrote a blog article on this very subject, showing a pure SQL solution.

A quick search on Google also churned up:

DuplicateIndexFinder.java;
A similar tool written in Perl that was part of maatkit, now Percona Toolkit; and
mysqlindexanalyzer.

Based on what I see in the link there do seem to be a number of extraneous indexes like these:

KEY `auth_id_2` (`auth_id`),
KEY `auth_id_4` (`auth_id`),
KEY `auth_id_5` (`auth_id`),
KEY `auth_id_6` (`auth_id`)

I thought maybe they were built in different ways. For instance one could be a BTREE and another a HASH, which could potentially serve some purpose. But they're all the same according to SHOW. Removing the extra identical ones should not be an issue.

There is a purpose to having, for instance, all three of these though:

KEY `articleid` (`artid`),
KEY `artSub` (`artSub`),
KEY `comments_ibfk_3` (`artSub`,`artid`)

Building the index on two columns is not necessarily the same as building two separate indexes on one.

Лицензировано под: CC-BY-SA с атрибуция

Не связан с StackOverflow