Seek for duplicates in an array

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
Albireo
Posts: 1748
Joined: 16 Oct 2013, 13:53

Seek for duplicates in an array

21 Feb 2018, 04:28

Is this the only way to find duplicates in an array?

Code: Select all

Delimiter := ";"
Record = "Field1";"Field2";"Field3";Field1;"Field5"
TempArray :=  StrSplit(Record, Delimiter, "`t`n`r""""")

For ID, FieldArray In TempArray
{	Loop % TempArray.Length() - ID
	{	If (TempArray[ID] = TempArray[ID + A_Index] )
			MsgBox 64, Row.: %A_LineNumber% -> %A_ScriptName%, % "Fields to handle .: " TempArray[ID] "`nAvailable in at least two places!`n( Columns .: " ID " and " ID + A_Index " )"
		
	}
}
MsgBox 64, Row.: %A_LineNumber% -> %A_ScriptName%, Done!, 2
teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: Seek for duplicates in an array

21 Feb 2018, 08:10

Hi,

Should Field1 and "Field1" be considered as duplicates? Strictly speaking, they are not.
Albireo

Re: Seek for duplicates in an array

21 Feb 2018, 08:29

Yes!
Field1 appears in at least two places in the example (there can be more), and when that happens, the "problem" must be handled.
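For reference, a minimal sketch of why the quoted and unquoted spellings compare equal in the scripts in this thread: the third parameter of StrSplit (OmitChars) strips the listed characters from both ends of each field, so both forms become the bare text Field1 before any comparison. The variable name Fields and the shortened record are only illustrative.

```autohotkey
Delimiter := ";"
Record = "Field1";Field1
; OmitChars: tab, linefeed, carriage return and the double quote ("" is an escaped quote).
Fields := StrSplit(Record, Delimiter, "`t`n`r""")
; Both elements now hold the same text, so = reports them as equal (1).
MsgBox % Fields[1] . " | " . Fields[2] . " | equal: " . (Fields[1] = Fields[2])
```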
teadrinker

Re: Seek for duplicates in an array

21 Feb 2018, 10:05

Try:

Code: Select all

Delimiter := ";"
Record = "Field1";"Field2";"Field3";Field1;"Field5"
TempArray :=  StrSplit(Record, Delimiter, "`t`n`r""""")

duplicates := GetDuplicates(TempArray)
for k, v in duplicates
   str .= (str = "" ? "" : "`n") . k . " available in " . v.count . " places! Columns: " . v.pos
MsgBox, % str

GetDuplicates(testArr)  {
   arr := SortArray(testArr)
   duplicates := {}
   
   testItem := arr[1]
   Loop % ObjLength(arr) - 1  {
      key := RegExReplace(testItem, "\[\d+\]$")
      if ( key != RegExReplace(arr[A_Index + 1], "\[\d+\]$") )
         testItem := arr[A_Index + 1]
      else  {
         if !ObjHasKey(duplicates, key)
            duplicates[key] := { count: 1, pos: RegExReplace(testItem, ".*\[(\d+)\]$", "$1") }
         duplicates[key].count++
         duplicates[key].pos .= "," . RegExReplace(arr[A_Index + 1], ".*\[(\d+)\]$", "$1")
      }
   }
   Return duplicates
}

SortArray(testArr)  {
   arr := []
   for k, v in testArr
      arr.Push(v . "[" . k . "]")
   
   length := ObjLength(arr)
   Loop % length - 1  {
      i := A_Index
      Loop % length - i  {
         if ( RegExReplace(arr[A_Index], "\[\d+\]$") > RegExReplace(arr[A_Index + 1], "\[\d+\]$") )  {
            tmp := arr[A_Index + 1]
            arr[A_Index + 1] := arr[A_Index]
            arr[A_Index] := tmp
         }
      }
   }
   Return arr
}
Last edited by teadrinker on 21 Feb 2018, 10:27, edited 3 times in total.
teadrinker

Re: Seek for duplicates in an array

21 Feb 2018, 10:44

If the matching should be case sensitive:

Code: Select all

Delimiter := ";"
Record = "Field1";"Field2";"Field3";Field1;"Field5;field1"
TempArray :=  StrSplit(Record, Delimiter, "`t`n`r""""")

duplicates := GetDuplicates(TempArray)
for k, v in duplicates
   str .= (str = "" ? "" : "`n") . k . " available in " . v.count . " places! Columns: " . v.pos
MsgBox, % str

GetDuplicates(testArr)  {
   arr := SortArray(testArr)
   duplicates := {}
   
   testItem := arr[1]
   Loop % ObjLength(arr) - 1  {
      key := RegExReplace(testItem, "\[\d+\]$")
      if !( key == RegExReplace(arr[A_Index + 1], "\[\d+\]$") )
         testItem := arr[A_Index + 1]
      else  {
         if !ObjHasKey(duplicates, key)
            duplicates[key] := { count: 1, pos: RegExReplace(testItem, ".*\[(\d+)\]$", "$1") }
         duplicates[key].count++
         duplicates[key].pos .= "," . RegExReplace(arr[A_Index + 1], ".*\[(\d+)\]$", "$1")
      }
   }
   Return duplicates
}

SortArray(testArr)  {
   arr := []
   for k, v in testArr
      arr.Push(v . "[" . k . "]")
   
   length := ObjLength(arr)
   Loop % length - 1  {
      i := A_Index
      Loop % length - i  {
         if ( RegExReplace(arr[A_Index], "\[\d+\]$") > RegExReplace(arr[A_Index + 1], "\[\d+\]$") )  {
            tmp := arr[A_Index + 1]
            arr[A_Index + 1] := arr[A_Index]
            arr[A_Index] := tmp
         }
      }
   }
   Return arr
}
teadrinker

Re: Seek for duplicates in an array

21 Feb 2018, 11:04

No, sorry, the case-sensitive example doesn't work properly. :)
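A likely cause, offered as an assumption rather than something stated in the thread: in AutoHotkey v1.1, the string keys of an ordinary object are looked up case-insensitively, so a plain {} cannot keep Field1 and field1 apart even when the values are compared with ==. A minimal sketch of that behaviour:

```autohotkey
obj := {}
obj["Field1"] := "first"
; String keys of a v1 object are case-insensitive, so this finds the key stored above.
MsgBox % "HasKey(""field1""): " . obj.HasKey("field1")
; Assigning through the lower-case spelling overwrites the same entry.
obj["field1"] := "second"
MsgBox % "obj[""Field1""]: " . obj["Field1"]
```

The string comparison with > in SortArray is also case-insensitive unless StringCaseSense is turned on, so differently cased spellings are not guaranteed to end up grouped together after sorting.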
teadrinker

Re: Seek for duplicates in an array

21 Feb 2018, 11:44

This works:

Code: Select all

Delimiter := ";"
Record = "Field1";"Field2";"field1";field1;Field1;"Field6";field1
TempArray :=  StrSplit(Record, Delimiter, "`t`n`r""""")

duplicates := GetDuplicates(TempArray)
for k, v in duplicates
   str .= (str = "" ? "" : "`n") . k . " available in " . v.count . " places! Columns: " . v.pos
MsgBox, % str

GetDuplicates(testArr)  {
   arr := SortArray(testArr)
   duplicates := CSobj()
   
   testItem := arr[1]
   Loop % ObjLength(arr) - 1  {
      key := RegExReplace(testItem, "\[\d+\]$")
      if !( key == RegExReplace(arr[A_Index + 1], "\[\d+\]$") )
         testItem := arr[A_Index + 1]
      else  {
         if !duplicates.HasKey(key)
            duplicates[key] := { count: 1, pos: RegExReplace(testItem, ".*\[(\d+)\]$", "$1") }
         duplicates[key].count++
         duplicates[key].pos .= "," . RegExReplace(arr[A_Index + 1], ".*\[(\d+)\]$", "$1")
      }
   }
   Return duplicates
}

SortArray(testArr)  {
   StringCaseSense, On
   arr := []
   for k, v in testArr
      arr.Push(v . "[" . k . "]")
   
   length := ObjLength(arr)
   Loop % length - 1  {
      i := A_Index
      Loop % length - i  {
         if ( RegExReplace(arr[A_Index], "\[\d+\]$") > RegExReplace(arr[A_Index + 1], "\[\d+\]$") )  {
            tmp := arr[A_Index + 1]
            arr[A_Index + 1] := arr[A_Index]
            arr[A_Index] := tmp
         }
      }
   }
   Return arr
}

CSobj() {
   static base := object("_NewEnum","__NewEnum", "Next","__Next", "__Set","__Setter", "__Get","__Getter", "__Call","__Caller")
   return, object("__sd_obj__", ComObjCreate("Scripting.Dictionary"), "base", base)
}
__Getter(self, key) {
   return, self.__sd_obj__.item("" key)
}
__Setter(self, key, value) {
   self.__sd_obj__.item("" key) := value
   return, false
}
__NewEnum(self) {
   return, self
}
__Next(self, ByRef key = "", ByRef val = "") {
   static Enum
   if not Enum
      Enum := self.__sd_obj__._NewEnum
   if Not Enum[key], val:=self[key]
      return, Enum:=false
   return, true
}
__Caller(self, name, value) {
   if (name = "count")
      return, self.__sd_obj__.count
   if (name = "HasKey")
      return, self.__sd_obj__.Exists(value)
}
teadrinker

Re: Seek for duplicates in an array

21 Feb 2018, 15:47

An easier way:

Code: Select all

Delimiter := ";"
Record = "Field1";"Field2";"field1";field1;Field1;"Field6";field1
TempArray :=  StrSplit(Record, Delimiter, "`t`n`r""""")

duplicates := GetDuplicates(TempArray)  ; not case sensitive
for k, v in duplicates
   str .= (str = "" ? "" : "`n") . k . " available in " . v.count . " places! Columns: " . v.pos
MsgBox,, Not case sensitive, % str

str := ""
duplicates := GetDuplicates(TempArray, true)  ; case sensitive
for k, v in duplicates
   str .= (str = "" ? "" : "`n") . k . " available in " . v.count . " places! Columns: " . v.pos
MsgBox,, Case sensitive, % str

GetDuplicates(testArr, caseSense := false)  {
   duplicates := caseSense ? CSobj() : {}
   for k, v in testArr  {
      if duplicates.HasKey(v)
         continue
      Loop % testArr.length() - k  {
         if (caseSense && v == testArr[k + A_Index]) || (!caseSense && v = testArr[k + A_Index])  {
            (!duplicates.HasKey(v) && duplicates[v] := {count: 1, pos: k})
            duplicates[v].count++, duplicates[v].pos .= "," . k + A_Index
         }
      }
   }
   Return duplicates
}
   
CSobj() {
   static base := object("_NewEnum","__NewEnum", "Next","__Next", "__Set","__Setter", "__Get","__Getter", "__Call","__Caller")
   return, object("__sd_obj__", ComObjCreate("Scripting.Dictionary"), "base", base)
}
__Getter(self, key) {
   return, self.__sd_obj__.item("" key)
}
__Setter(self, key, value) {
   self.__sd_obj__.item("" key) := value
   return, false
}
__NewEnum(self) {
   return, self
}
__Next(self, ByRef key = "", ByRef val = "") {
   static Enum
   if not Enum
      Enum := self.__sd_obj__._NewEnum
   if Not Enum[key], val:=self[key]
      return, Enum:=false
   return, true
}
__Caller(self, name, value) {
   if (name = "count")
      return, self.__sd_obj__.count
   if (name = "HasKey")
      return, self.__sd_obj__.Exists(value)
}
Albireo

Re: Seek for duplicates in an array

21 Feb 2018, 17:27

Easier! :? :? (I'm impressed)
Can you explain a little of what happens?
(I did not think it was so complicated ...)
//Jan
teadrinker

Re: Seek for duplicates in an array

21 Feb 2018, 17:48

:) If you don't need case sensitivity, the code could look simpler:

Code: Select all

Delimiter := ";"
Record = "Field1";"Field2";"field1";field1;Field1;"Field6";field1
TempArray :=  StrSplit(Record, Delimiter, "`t`n`r""""")

duplicates := GetDuplicates(TempArray)  ; not case sensitive
for k, v in duplicates
   str .= (str = "" ? "" : "`n") . k . " available in " . v.count . " places! Columns: " . v.pos
MsgBox,, Not case sensitive, % str

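GetDuplicates(testArr)  {
   duplicates := {}   ; value -> {count, pos}
   for k, v in testArr  {
      ; Skip values already recorded as duplicates at an earlier position.
      if duplicates.HasKey(v)
         continue
      ; Compare v with every element that comes after position k.
      Loop % testArr.length() - k  {
         if (v = testArr[k + A_Index])  {   ; = is case-insensitive by default
            ; On the first match, create the entry with the original position k.
            (!duplicates.HasKey(v) && duplicates[v] := {count: 1, pos: k})
            duplicates[v].count++, duplicates[v].pos .= "," . k + A_Index
         }
      }
   }
   Return duplicates
}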
I just compare each element with the ones after it, one by one.
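As a complement, here is a hedged sketch of a single-pass variant (not teadrinker's code): it records every position of each value in an ordinary object and then keeps only the values seen more than once. FindDuplicates and its variable names are hypothetical. Because it relies on the keys of a v1 object, it is case-insensitive, and values that look like integers are stored as integer keys.

```autohotkey
; Hypothetical helper: returns value -> {count, pos} for every value seen more than once.
FindDuplicates(arr) {
   seen := {}   ; every value encountered so far, with its count and positions
   for index, value in arr {
      if seen.HasKey(value)
         seen[value].count++, seen[value].pos .= "," . index
      else
         seen[value] := {count: 1, pos: index}
   }
   duplicates := {}
   for value, info in seen
      if (info.count > 1)
         duplicates[value] := info
   return duplicates
}

TempArray := StrSplit("Field1;Field2;Field3;Field1;Field5", ";")
for value, info in FindDuplicates(TempArray)
   MsgBox % value . " appears " . info.count . " times, columns: " . info.pos
```

With this input the only entry reported should be Field1, at columns 1 and 4. Each element is visited once instead of being compared against all later elements.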
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: Seek for duplicates in an array

21 Feb 2018, 19:55

Here are two examples for removing duplicates.

Code: Select all

q:: ;remove duplicates (case insensitive)
vText := "abc,def,ghi,ABC,def"
oArray := {}
vOutput := ""
Loop, Parse, vText, % ","
{
	if !oArray.HasKey("z" A_LoopField)
	{
		oArray["z" A_LoopField] := ""
		vOutput .= A_LoopField "`r`n"
	}
}
MsgBox, % vOutput
return

w:: ;remove duplicates (case sensitive)
vText := "abc,def,ghi,ABC,def"
oArray := ComObjCreate("Scripting.Dictionary")
vOutput := ""
Loop, Parse, vText, % ","
{
	if !oArray.Exists("" A_LoopField)
	{
		oArray.Item("" A_LoopField) := ""
		vOutput .= A_LoopField "`r`n"
	}
}
MsgBox, % vOutput
return
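The two hotkeys above could also be folded into one function, sketched here under assumptions: RemoveDuplicates and its caseSense parameter are hypothetical names, and the sketch uses the CompareMode property of Scripting.Dictionary (0 = binary, 1 = text) to switch between case-sensitive and case-insensitive matching. CompareMode has to be set while the dictionary is still empty.

```autohotkey
; Hypothetical helper, not from the thread: returns a new array without repeated values.
RemoveDuplicates(arr, caseSense := false) {
   seen := ComObjCreate("Scripting.Dictionary")
   seen.CompareMode := caseSense ? 0 : 1   ; 0 = binary (case sensitive), 1 = text
   result := []
   for index, value in arr {
      ; "" . value forces a string key, as in the examples above.
      if (!seen.Exists("" . value)) {
         seen.Item("" . value) := ""
         result.Push(value)
      }
   }
   return result
}

vOutput := ""
for index, value in RemoveDuplicates(StrSplit("abc,def,ghi,ABC,def", ","))
   vOutput .= value . "`r`n"
MsgBox, % vOutput
```

Called without the second argument, this keeps the first spelling of each value (abc, def, ghi); passing true keeps ABC as a separate entry.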