Conditional merging of two dataframes using pandas


I have two dataframes: df1 with columns ISIN, Name, Weight, and df2 with columns Short Name, ISIN.

df1 =

ISIN    Name            Weight
        Enbridge Inc    0.1
        UDR Inc         1.1
        Tyson Foods Inc 1.9

and df2=

Short Name            ISIN
Enbridge Inc.         bvefj154
UDR Group             iuhb38g7
Tyson Foods Pvt Ltd.  hruidf12

I have developed fuzzy logic that matches Name from df1 against Short Name from df2. Using that logic, it will know that Enbridge Inc in both dataframes is one and the same. UDR Group and UDR Inc are also treated as the same, since the names match: not entirely, but almost.

I am looking for a way to populate the ISIN column in df1 based on that logic: if the names match (as Enbridge Inc does in both), select the ISIN for the respective Short Name from df2 and add it to the ISIN column in df1 wherever the relevant name is present.

So the output that I look forward to would look like this: df1=

ISIN            Name            Weight
bvefj154        Enbridge Inc    0.1
iuhb38g7        UDR Inc         1.1
hruidf12        Tyson Foods Inc 1.9

Using pandas's merge function I tried to achieve this, but got an error like:

KeyError: 'Name'

here's the code for the same.

import pandas as pd
df1 = pd.merge(df1, df2, on=['Name', '% Weight'], how='right')

How can I do this? Please help.

EDIT: Here is the code for the fuzzy matching logic, using the fuzzywuzzy module

from fuzzywuzzy import process

def fuzzy_merge(df_1, df_2, key1, key2, threshold=90, limit=1):
    """
    :param df_1: the left table to join
    :param df_2: the right table to join
    :param key1: key column of the left table
    :param key2: key column of the right table
    :param threshold: how close the matches should be to return a match, based on Levenshtein distance
    :param limit: the amount of matches that will get returned, these are sorted high to low
    :return: dataframe with both keys and matches
    """
    s = df_2[key2].tolist()

    m = df_1[key1].apply(lambda x: process.extract(x, s, limit=limit))
    df_1['matches'] = m

    m2 = df_1['matches'].apply(lambda x: ', '.join([i[0] for i in x if i[1] >= threshold]))
    df_1['matches'] = m2

    print(df_1)
    df_1.to_csv('fuzzy-1390-match.csv')
    #return df_1


fuzzy_merge(df1, df2, 'Name', 'Short Name', threshold=90)

Output:

5,Enbridge Inc Flt 07/15/80 Sr:20-A,0.0127,ENBRIDGE INC
6,Enbridge Inc. 6.25% 03/01/78,0.0122,ENBRIDGE INC
7,Emera 6.75% 6/15/76-26,0.0113,MERA
8,Scentre Group Trust 2 Flt 09/24/80 Sr:144A,0.011,SCENTRE GROUP
9,Credit Suisse Group AG 7.5 Perp,0.0106,
10,Aegon Funding Corp Ii 5.100% 12/15/49,0.0101,
11,Dte Energy Co 5.250% 12/01/77 Sr:E,0.01,DTE ENERGY CO
12,Dai-Ichi Life Insurance 4%,0.0099,
13,Southern Co Flt 09/15/51 Sr:21-A,0.0098,SOUTHERN CO

EDIT2:

This is the dataframe(df1 and df2): df1=

0   Transcanada Trust 5.875 08/15/76            0.0176
1   Bp Capital Markets Plc Flt Perp             0.0169
2   Transcanada Trust Flt 09/15/79              0.0169
3   Bp Capital Markets Plc Flt Perp             0.0155
4   Prudential Financial 5.375% 5/15/45         0.0150
5   Enbridge Inc Flt 07/15/80 Sr:20-A           0.0127
6   Enbridge Inc. 6.25% 03/01/78                0.0122
7   Emera 6.75% 6/15/76-26                      0.0113
8   Scentre Group Trust 2 Flt 09/24/80 Sr:144A  0.0110
9   Credit Suisse Group AG 7.5 Perp             0.0106
10  Aegon Funding Corp Ii 5.100% 12/15/49       0.0101
11  Dte Energy Co 5.250% 12/01/77 Sr:E          0.0100
12  Dai-Ichi Life Insurance 4%                  0.0099
13  Southern Co Flt 09/15/51 Sr:21-A            0.0098
14  Prudential Financial 5.625% 6/15/43         0.0097
15  Southern Co 4.950% 01/30/80 Sr:2020         0.0093
16  Scentre Group Trust 2 Flt 09/24/80 Sr:144A  0.0093
17  Metlife Inc 9.25% 4/8/2038 144A             0.0089
18  American Intl Group 8.175% 5/15/58          0.0086
19  Southern Co Flt 01/15/51 Sr:B               0.0079

df2=

          Short Name          ISIN
0   ABU DHABI COMMER  AEA000201011
1   ABU DHABI NATION  AEA002401015
2   ABU DHABI NATION  AEA006101017
3   ADNOC DRILLING C  AEA007301012
4   ALPHA DHABI HOLD  AEA007601015
5      DUBAI ISLAMIC  AED000201015
6    EMAAR PROP PJSC  AEE000301011
7           ETISALAT  AEE000401019
8   EMIRATES NBD PJS  AEE000801010
9    INTL HOLDING CO  AEI000201014
10   FIRST ABU DHABI  AEN000101016
11  SCHLUMBERGER LTD  AN8068571086
12  ERSTE GROUP BANK  AT0000652011
13            OMV AG  AT0000743059
14        VERBUND AG  AT0000746409
15  ARISTOCRAT LEISU  AU000000ALL7
16  AUST AND NZ BANK  AU000000ANZ3
17      AFTERPAY LTD  AU000000APT1
18           ASX LTD  AU000000ASX7
19     BHP GROUP LTD  AU000000BHP4

Answer 1 : Conditional merging of two dataframes using pandas

You can use split to get the result. I check whether the first word of df1.Name is contained in df2.Short Name.

import pandas as pd

df1 = pd.DataFrame({'ISIN': ['', '', ''], 'Name': ['Enbridge Inc', 'UDR Inc', 'Tyson Foods Inc'], 'Weight': ['0.1', '1.1', '1.9']})
df2 = pd.DataFrame({'Short Name': ['Enbridge Inc.', 'UDR Group', 'Tyson Foods Pvt Ltd.'], 'ISIN': ['bvefj154', 'iuhb38g7', 'hruidf12']})

def strMergeData(strColumnDf1):
    # Keep only the first word of the df1 name
    strColumnDf1 = strColumnDf1.split()[0]
    for strColumnDf2 in df2['Short Name']:
        # Return the ISIN of the first Short Name that contains that word
        if strColumnDf1 in strColumnDf2:
            return df2[df2['Short Name'] == strColumnDf2]['ISIN'].values[0]
    # No Short Name contains the word
    return None

df1['ISIN'] = df1.apply(lambda x: strMergeData(x['Name']), axis=1)
print(df1)

Output :

       ISIN             Name Weight
0  bvefj154     Enbridge Inc    0.1
1  iuhb38g7          UDR Inc    1.1
2  hruidf12  Tyson Foods Inc    1.9
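
As a side note on the question's error: pd.merge needs every column listed in on= to exist in both frames, and df2 has neither 'Name' nor '% Weight', hence the KeyError. One way around it, sketched below, is to build a shared key first and merge on that. This is only a minimal sketch on the toy frames from the question; the helper column first_word and the names lookup and merged are made up for illustration. It also compares first words for equality, which is slightly stricter than the substring test used above.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Enbridge Inc', 'UDR Inc', 'Tyson Foods Inc'],
                    'Weight': [0.1, 1.1, 1.9]})
df2 = pd.DataFrame({'Short Name': ['Enbridge Inc.', 'UDR Group', 'Tyson Foods Pvt Ltd.'],
                    'ISIN': ['bvefj154', 'iuhb38g7', 'hruidf12']})

# Hypothetical helper column: upper-cased first word of each name.
df1['first_word'] = df1['Name'].str.split().str[0].str.upper()
df2['first_word'] = df2['Short Name'].str.split().str[0].str.upper()

# Keep one row per key so the merge cannot duplicate rows of df1.
lookup = df2.drop_duplicates(subset='first_word')[['first_word', 'ISIN']]

# Left merge keeps every df1 row; names without a match get a missing ISIN.
merged = df1.merge(lookup, on='first_word', how='left').drop(columns='first_word')
print(merged[['ISIN', 'Name', 'Weight']])
```

Because of drop_duplicates, when several Short Names share a first word only the first one is used, which mirrors the "first hit wins" behaviour of the loop above.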


Below is a test of your example data with the provided code. I just added TRANSCANADA to ALPHA DHABI HOLD so that at least some rows match.

import pandas as pd

df1 = pd.DataFrame({'ISIN': ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],
                    'Name': ['Transcanada Trust 5.875 08/15/76',
                             'Bp Capital Markets Plc Flt Perp',
                             'Transcanada Trust Flt 09/15/79',
                             'Bp Capital Markets Plc Flt Perp',
                             'Prudential Financial 5.375% 5/15/45',
                             'Enbridge Inc Flt 07/15/80 Sr:20-A',
                             'Enbridge Inc. 6.25% 03/01/78',
                             'Emera 6.75% 6/15/76-26',
                             'Scentre Group Trust 2 Flt 09/24/80 Sr:144A',
                             'Credit Suisse Group AG 7.5 Perp',
                             'Aegon Funding Corp Ii 5.100% 12/15/49',
                             'Dte Energy Co 5.250% 12/01/77 Sr:E',
                             'Dai-Ichi Life Insurance 4%',
                             'Southern Co Flt 09/15/51 Sr:21-A',
                             'Prudential Financial 5.625% 6/15/43',
                             'Southern Co 4.950% 01/30/80 Sr:2020',
                             'Scentre Group Trust 2 Flt 09/24/80 Sr:144A',
                             'Metlife Inc 9.25% 4/8/2038 144A',
                             'American Intl Group 8.175% 5/15/58',
                             'Southern Co Flt 01/15/51 Sr:B',
                             19.5],
                    'Weight': [0.0176, 0.0169, 0.0169, 0.0155, 0.0150, 0.0127, 0.0122, 0.0113, 0.0110, 0.0106, 0.0101, 0.0100,
                               0.0099, 0.0098, 0.0097, 0.0093, 0.0093, 0.0089, 0.0086, 0.0079, 0.0091]})

df2 = pd.DataFrame({'Short Name': ['ABU DHABI COMMER', 'ABU DHABI NATION', 'ABU DHABI NATION',
                                   'ADNOC DRILLING C', 'TRANSCANADA ALPHA DHABI HOLD', 'DUBAI ISLAMIC',
                                   'EMAAR PROP PJSC', 'ETISALAT', 'EMIRATES NBD PJS', 'INTL HOLDING CO',
                                   'FIRST ABU DHABI', 'SCHLUMBERGER LTD', 'ERSTE GROUP BANK', 'OMV AG',
                                   'VERBUND AG', 'ARISTOCRAT LEISU', 'AUST AND NZ BANK', 'AFTERPAY LTD',
                                   'ASX LTD', 'BHP GROUP LTD', 19.5],
                    'ISIN': ['AEA000201011', 'AEA002401015', 'AEA006101017', 'AEA007301012', 'AEA007601015',
                             'AED000201015', 'AEE000301011', 'AEE000401019', 'AEE000801010', 'AEI000201014',
                             'AEN000101016', 'AN8068571086', 'AT0000652011', 'AT0000743059', 'AT0000746409',
                             'AU000000ALL7', 'AU000000ANZ3', 'AU000000APT1', 'AU000000ASX7', 'AU000000BHP4', 'FLOAT_TEST']})

def strMergeData(strColumnDf1):
    # str() guards against non-string values such as the float 19.5
    strColumnDf1 = str(strColumnDf1).split()[0]
    for strColumnDf2 in df2['Short Name']:
        # Case-insensitive test: is the first word contained in the Short Name?
        if str(strColumnDf1).upper() in str(strColumnDf2).upper():
            return df2[df2['Short Name'] == strColumnDf2]['ISIN'].values[0]
    # No Short Name contains the word
    return None

df1['ISIN'] = df1.apply(lambda x: strMergeData(x['Name']), axis=1)
print(df1)

Output :

            ISIN                                        Name  Weight
0   AEA007601015            Transcanada Trust 5.875 08/15/76  0.0176
1           None             Bp Capital Markets Plc Flt Perp  0.0169
2   AEA007601015              Transcanada Trust Flt 09/15/79  0.0169
3           None             Bp Capital Markets Plc Flt Perp  0.0155
4           None         Prudential Financial 5.375% 5/15/45  0.0150
5           None           Enbridge Inc Flt 07/15/80 Sr:20-A  0.0127
6           None                Enbridge Inc. 6.25% 03/01/78  0.0122
7           None                      Emera 6.75% 6/15/76-26  0.0113
8           None  Scentre Group Trust 2 Flt 09/24/80 Sr:144A  0.0110
9           None             Credit Suisse Group AG 7.5 Perp  0.0106
10          None       Aegon Funding Corp Ii 5.100% 12/15/49  0.0101
11          None          Dte Energy Co 5.250% 12/01/77 Sr:E  0.0100
12          None                  Dai-Ichi Life Insurance 4%  0.0099
13          None            Southern Co Flt 09/15/51 Sr:21-A  0.0098
14          None         Prudential Financial 5.625% 6/15/43  0.0097
15          None         Southern Co 4.950% 01/30/80 Sr:2020  0.0093
16          None  Scentre Group Trust 2 Flt 09/24/80 Sr:144A  0.0093
17          None             Metlife Inc 9.25% 4/8/2038 144A  0.0089
18          None          American Intl Group 8.175% 5/15/58  0.0086
19          None               Southern Co Flt 01/15/51 Sr:B  0.0079
20    FLOAT_TEST                                        19.5  0.0091
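
If you would rather score whole names than test the first word, the lookup can be driven by a similarity ratio, as in the question's fuzzy approach. The sketch below swaps fuzzywuzzy for difflib from the standard library, so its scores are not the same as fuzzywuzzy's. It assumes every name is a string and uses the toy frames from the question; best_isin, isin_by_name and the cutoff value are illustrative choices, not part of the code above.

```python
import difflib

import pandas as pd

df1 = pd.DataFrame({'Name': ['Enbridge Inc', 'UDR Inc', 'Tyson Foods Inc'],
                    'Weight': [0.1, 1.1, 1.9]})
df2 = pd.DataFrame({'Short Name': ['Enbridge Inc.', 'UDR Group', 'Tyson Foods Pvt Ltd.'],
                    'ISIN': ['bvefj154', 'iuhb38g7', 'hruidf12']})

# Map each upper-cased Short Name to its ISIN (a later duplicate name overwrites an earlier one).
isin_by_name = dict(zip(df2['Short Name'].str.upper(), df2['ISIN']))

def best_isin(name, cutoff=0.6):
    """Hypothetical helper: ISIN of the closest Short Name, or None if nothing reaches the cutoff."""
    hits = difflib.get_close_matches(str(name).upper(), list(isin_by_name), n=1, cutoff=cutoff)
    return isin_by_name[hits[0]] if hits else None

df1['ISIN'] = [best_isin(name) for name in df1['Name']]
print(df1[['ISIN', 'Name', 'Weight']])
```

With these toy names, 'UDR Inc' and 'UDR Group' share only the prefix 'UDR ' and fall below the default cutoff of 0.6, so that row stays unmatched. Lowering the cutoff would catch it, but at the risk of false matches, which is the usual trade-off when picking a threshold.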
