Querying csv files in python like sql


This is apparently a popular interview programming question. There are 2 CSV files with dinosaur data. We need to query them to return dinosaurs satisfying a certain condition.

Note - We cannot use additional modules like q, fsql, csvkit etc.

file1.csv:

NAME,LEG_LENGTH,DIET
Hadrosaurus,1.2,herbivore
Struthiomimus,0.92,omnivore
Velociraptor,1.0,carnivore
Triceratops,0.87,herbivore
Euoplocephalus,1.6,herbivore
Stegosaurus,1.40,herbivore
Tyrannosaurus Rex,2.5,carnivore

file2.csv

NAME,STRIDE_LENGTH,STANCE
Euoplocephalus,1.87,quadrupedal
Stegosaurus,1.90,quadrupedal
Tyrannosaurus Rex,5.76,bipedal
Hadrosaurus,1.4,bipedal
Deinonychus,1.21,bipedal
Struthiomimus,1.34,bipedal
Velociraptor,2.72,bipedal

Using the formula: speed = ((STRIDE_LENGTH / LEG_LENGTH) - 1) * SQRT(LEG_LENGTH * g), where g = 9.8 m/s^2

Write a program to read the csv files and print only the names of bipedal dinosaurs, sorted by speed from fastest to slowest.

In SQL, this would be simple:

select f2.name from
file1 f1 join file2 f2 on f1.name = f2.name
where f2.stance = 'bipedal'
order by (f2.stride_length/f1.leg_length - 1)*pow(f1.leg_length*9.8,0.5) desc

How can this be done in python ?


Answers 1 : of Querying csv files in python like sql

You can do it in pandas,

import pandas as pd

df_1 = pd.read_csv('file1.csv')
df_2 = pd.read_csv('file2.csv')

# Join file2's columns onto file1 by NAME, then keep only bipedal dinosaurs
df_comb = df_1.join(df_2.set_index('NAME'), on='NAME')
df_comb = df_comb.loc[df_comb.STANCE == 'bipedal'].copy()
df_comb['SPEED'] = (df_comb.STRIDE_LENGTH / df_comb.LEG_LENGTH - 1) * (df_comb.LEG_LENGTH * 9.8) ** 0.5
# Print the names from fastest to slowest
print('\n'.join(df_comb.sort_values('SPEED', ascending=False).NAME))

Not as clean as SQL!


Answers 2 : of Querying csv files in python like sql

You can write SQL in python using pandasql.
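pandasql is a third-party package, so it may not be allowed under the "no additional modules" rule. As far as I know it runs the query through SQLite by default, and Python ships with the sqlite3 module, so something similar can be sketched with the standard library only. This is a rough sketch, not pandasql itself: the helper names (load_table, bipedal_by_speed) are made up for illustration, the CSV text is pasted inline from the question instead of read from disk, and SQRT is registered by hand because not every SQLite build includes math functions.

```python
import csv
import io
import math
import sqlite3

# Data copied from the question; in practice it would be read from file1.csv / file2.csv.
FILE1 = """NAME,LEG_LENGTH,DIET
Hadrosaurus,1.2,herbivore
Struthiomimus,0.92,omnivore
Velociraptor,1.0,carnivore
Triceratops,0.87,herbivore
Euoplocephalus,1.6,herbivore
Stegosaurus,1.40,herbivore
Tyrannosaurus Rex,2.5,carnivore
"""
FILE2 = """NAME,STRIDE_LENGTH,STANCE
Euoplocephalus,1.87,quadrupedal
Stegosaurus,1.90,quadrupedal
Tyrannosaurus Rex,5.76,bipedal
Hadrosaurus,1.4,bipedal
Deinonychus,1.21,bipedal
Struthiomimus,1.34,bipedal
Velociraptor,2.72,bipedal
"""


def load_table(conn, table, text, numeric_column):
    """Create `table` from CSV text, storing `numeric_column` as a float."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys())
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        row[numeric_column] = float(row[numeric_column])
        conn.execute(
            f"INSERT INTO {table} VALUES ({placeholders})",
            [row[c] for c in columns],
        )


def bipedal_by_speed():
    conn = sqlite3.connect(":memory:")
    # Register SQRT ourselves so the query does not depend on the SQLite build.
    conn.create_function("SQRT", 1, math.sqrt)
    load_table(conn, "file1", FILE1, "LEG_LENGTH")
    load_table(conn, "file2", FILE2, "STRIDE_LENGTH")
    query = """
        SELECT f2.NAME
        FROM file1 f1 JOIN file2 f2 ON f1.NAME = f2.NAME
        WHERE f2.STANCE = 'bipedal'
        ORDER BY (f2.STRIDE_LENGTH / f1.LEG_LENGTH - 1)
                 * SQRT(f1.LEG_LENGTH * 9.8) DESC
    """
    names = [name for (name,) in conn.execute(query)]
    conn.close()
    return names


print("\n".join(bipedal_by_speed()))
```

Deinonychus has no row in file1.csv, so the inner join drops it even though it is bipedal.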


Answers 3 : of Querying csv files in python like sql

def csvtable(file):     # Read CSV file into 2-D dictionary
    table = {}
    with open(file) as f:
        columns = f.readline().strip().split(',')       # Get column names

        for line in f:
            values = line.strip().split(',')            # Get current row
            for column, value in zip(columns, values):
                if column == 'NAME':                    # table['TREX'] = {}
                    key = value
                    table[key] = {}
                else:
                    table[key][column] = value          # table['TREX']['LENGTH'] = 10

    return table


# READ
try:
    table1 = csvtable('file1.csv')
    table2 = csvtable('file2.csv')
except OSError as e:
    raise SystemExit(e)                                 # Cannot continue without both files


# JOIN, FILTER & COMPUTE
table3 = {}
for value in table1.keys():
    if value in table2.keys() and table2[value]['STANCE'] == 'bipedal':    # Join both tables on key (NAME) and filter (STANCE)

        leg_length = float(table1[value]['LEG_LENGTH'])
        stride_length = float(table2[value]['STRIDE_LENGTH'])
        speed = ((stride_length / leg_length) - 1) * pow((leg_length * 9.8), 0.5)    # Compute SPEED

        table3[value] = speed


# SORT
result = sorted(table3, key=lambda x: table3[x], reverse=True)    # Sort descending by value

# WRITE
try:
    with open('result.txt', 'w') as f:
        for r in result:
            f.write('%s\n' % r)
except OSError as e:
    print(e)
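
Splitting each line on ',' by hand works for this data, but it would break on quoted fields that contain commas. A variation on the same join/filter/sort idea using the standard-library csv module might look like the sketch below. The function names (read_by_name, bipedal_names_by_speed) are made up for illustration, and the sample text is a subset of the question's rows fed in through io.StringIO so the snippet runs without any files.

```python
import csv
import io
import math

G = 9.8  # m/s^2, as given in the question


def read_by_name(handle):
    """Return {NAME: row_dict} for an open CSV file-like object."""
    return {row["NAME"]: row for row in csv.DictReader(handle)}


def bipedal_names_by_speed(file1, file2):
    """Join on NAME, keep bipedal rows, sort names by speed descending."""
    legs = read_by_name(file1)
    strides = read_by_name(file2)
    speeds = {}
    for name, row in strides.items():
        # Inner join: skip dinosaurs that are missing from file1.
        if row["STANCE"] == "bipedal" and name in legs:
            leg = float(legs[name]["LEG_LENGTH"])
            stride = float(row["STRIDE_LENGTH"])
            speeds[name] = (stride / leg - 1) * math.sqrt(leg * G)
    return sorted(speeds, key=speeds.get, reverse=True)


# Subset of the rows from the question, inlined so the sketch is self-contained.
sample1 = io.StringIO(
    "NAME,LEG_LENGTH,DIET\n"
    "Hadrosaurus,1.2,herbivore\n"
    "Velociraptor,1.0,carnivore\n"
    "Stegosaurus,1.40,herbivore\n"
    "Tyrannosaurus Rex,2.5,carnivore\n"
)
sample2 = io.StringIO(
    "NAME,STRIDE_LENGTH,STANCE\n"
    "Stegosaurus,1.90,quadrupedal\n"
    "Tyrannosaurus Rex,5.76,bipedal\n"
    "Hadrosaurus,1.4,bipedal\n"
    "Deinonychus,1.21,bipedal\n"
    "Velociraptor,2.72,bipedal\n"
)

names = bipedal_names_by_speed(sample1, sample2)
print("\n".join(names))
```

With real files, the two StringIO objects would be replaced by open('file1.csv', newline='') and open('file2.csv', newline='').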

Answers 4 : of Querying csv files in python like sql

I've encountered the same problem at work and decided to build an offline desktop app where you can load CSVs and start writing SQL. You can join, group by, etc.

This is backed by C and SQLite and can handle GBs of CSV files in ~10 seconds. It's very fast.

Here's the app: https://superintendent.app/

This is not Python, though, but it is a lot more convenient to use.
