Commit 0464fbd

committed
Fixes for transcripts.
1 parent 2797b69 commit 0464fbd

17 files changed

+3746
-73
lines changed

transcripts/154-python-in-genomics.txt

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -798,7 +798,7 @@
798798

799799
00:30:28 Looks really cool.
800800

801-
00:30:29 It definitely, they have it lined up to, when you go to visit spacey.io, it really looks appealing and polished.
801+
00:30:29 It definitely, they have it lined up to, when you go to visit spaCy.io, it really looks appealing and polished.
802802

803803
00:30:37 I was wondering why you didn't choose, what the difference or what made you choose spaCy over NLTK?
804804

transcripts/154-python-in-genomics.vtt

Lines changed: 1 addition & 1 deletion
@@ -1204,7 +1204,7 @@ So I highly recommend it.
12041204
Looks really cool.
12051205

12061206
00:30:29.300 --> 00:30:37.160
1207-
It definitely, they have it lined up to, when you go to visit spacey.io, it really looks appealing and polished.
1207+
It definitely, they have it lined up to, when you go to visit spaCy.io, it really looks appealing and polished.
12081208

12091209
00:30:37.160 --> 00:30:42.440
12101210
I was wondering why you didn't choose, what the difference or what made you choose spaCy over NLTK?

transcripts/202-software-biz.txt

Lines changed: 7 additions & 7 deletions
@@ -1772,7 +1772,7 @@
17721772

17731773
00:53:11 also the more like data science focused stuff that I've been working on also I'd
17741774

1775-
00:53:16 always wanted like a spacey extension I haven't I haven't had time to look into
1775+
00:53:16 always wanted like a spaCy extension I haven't I haven't had time to look into
17761776

17771777
00:53:20 this but like we have an open issue on the tracker so if someone listens to this and
17781778

@@ -1782,11 +1782,11 @@
17821782

17831783
00:53:33 but some of the people you're like oh somebody they I found this and they should
17841784

1785-
00:53:37 know about it and I'll go ahead and throw spacey out there for you already so we can
1785+
00:53:37 know about it and I'll go ahead and throw spaCy out there for you already so we can
17861786

1787-
00:53:41 put spacey on the list right yeah of course one idea I've had just that we have to
1787+
00:53:41 put spaCy on the list right yeah of course one idea I've had just that we have to
17881788

1789-
00:53:45 mention like spacey wouldn't have been possible without siphon and lots of other
1789+
00:53:45 mention like spaCy wouldn't have been possible without siphon and lots of other
17901790

17911791
00:53:48 probably most sometimes people don't realize this but most Python packages you
17921792

@@ -1848,19 +1848,19 @@
18481848

18491849
00:55:55 chance for a final call to action for folks I think we already did that around
18501850

1851-
00:55:58 the building a successful software business so maybe just around spacey and
1851+
00:55:58 the building a successful software business so maybe just around spaCy and
18521852

18531853
00:56:02 prodigy tell people that are interested in NLP and what you guys are up to
18541854

18551855
00:56:07 where can they check it out if you don't know about space yet you should check it
18561856

1857-
00:56:10 out if you feel like spacey is not going to be useful to you then you don't like
1857+
00:56:10 out if you feel like spaCy is not going to be useful to you then you don't like
18581858

18591859
00:56:13 I'm not you know I don't want to go here and tell everyone they should use
18601860

18611861
00:56:16 our software we're building very specific maybe slightly niche developer product
18621862

1863-
00:56:20 developer tools but you know spacey I think we really we put a lot of effort
1863+
00:56:20 developer tools but you know spaCy I think we really we put a lot of effort
18641864

18651865
00:56:24 into our documentation we have a nice getting started guide if you're interested I'm hoping it should you know
18661866

transcripts/202-software-biz.vtt

Lines changed: 7 additions & 7 deletions
@@ -8296,7 +8296,7 @@ also the more like data science focused
82968296
stuff that I've been working on also I'd
82978297

82988298
00:53:16.040 --> 00:53:18.960
8299-
always wanted like a spacey extension I
8299+
always wanted like a spaCy extension I
83008300

83018301
00:53:18.960 --> 00:53:20.160
83028302
haven't I haven't had time to look into
@@ -8329,16 +8329,16 @@ somebody they I found this and they should
83298329
know about it and I'll go ahead and throw
83308330

83318331
00:53:38.840 --> 00:53:41.180
8332-
spacey out there for you already so we can
8332+
spaCy out there for you already so we can
83338333

83348334
00:53:41.180 --> 00:53:43.700
8335-
put spacey on the list right yeah of course
8335+
put spaCy on the list right yeah of course
83368336

83378337
00:53:43.700 --> 00:53:45.380
83388338
one idea I've had just that we have to
83398339

83408340
00:53:45.380 --> 00:53:46.620
8341-
mention like spacey wouldn't have been
8341+
mention like spaCy wouldn't have been
83428342

83438343
00:53:46.620 --> 00:53:48.980
83448344
possible without siphon and lots of other
@@ -8530,7 +8530,7 @@ folks I think we already did that around
85308530
the building a successful software
85318531

85328532
00:56:00.340 --> 00:56:02.980
8533-
business so maybe just around spacey and
8533+
business so maybe just around spaCy and
85348534

85358535
00:56:02.980 --> 00:56:04.500
85368536
prodigy tell people that are interested
@@ -8545,7 +8545,7 @@ where can they check it out if you don't
85458545
know about space yet you should check it
85468546

85478547
00:56:10.040 --> 00:56:11.460
8548-
out if you feel like spacey is not going
8548+
out if you feel like spaCy is not going
85498549

85508550
00:56:11.460 --> 00:56:13.460
85518551
to be useful to you then you don't like
@@ -8563,7 +8563,7 @@ our software we're building very specific
85638563
maybe slightly niche developer product
85648564

85658565
00:56:20.620 --> 00:56:23.040
8566-
developer tools but you know spacey I
8566+
developer tools but you know spaCy I
85678567

85688568
00:56:23.040 --> 00:56:24.260
85698569
think we really we put a lot of effort

transcripts/465-the-ai-revolution-wont-be-monopolized.txt

Lines changed: 4 additions & 4 deletions
@@ -266,13 +266,13 @@
266266

267267
00:11:05 is how do you teach these things? Information and how do you get them to know things and,
268268

269-
00:11:11 and so on. And, you know, for the spacey world, you have Prodigy and maybe give a shout out to
269+
00:11:11 and so on. And, you know, for the spaCy world, you have Prodigy and maybe give a shout out to
270270

271271
00:11:15 Prodigy Teams. That's something you just are just announcing, right?
272272

273273
00:11:19 Yeah. So that's currently in beta. It's something we've been working on. So the idea of Prodigy has
274274

275-
00:11:23 always been, hey, you know, support a spacey, also other libraries. And how can we, yeah,
275+
00:11:23 always been, hey, you know, support a spaCy, also other libraries. And how can we, yeah,
276276

277277
00:11:28 how can we make the training and data collection process more efficient or so efficient that companies
278278

@@ -928,9 +928,9 @@
928928

929929
00:30:22 So that's really models that we're trying to do one specific or some specific things.
930930

931-
00:30:27 It's kind of what we distribute for spacey.
931+
00:30:27 It's kind of what we distribute for spaCy.
932932

933-
00:30:30 There's also a lot of really cool community projects like size spacey for scientific biomedical
933+
00:30:30 There's also a lot of really cool community projects like size spaCy for scientific biomedical
934934

935935
00:30:37 techs.
936936

transcripts/465-the-ai-revolution-wont-be-monopolized.vtt

Lines changed: 4 additions & 4 deletions
@@ -403,7 +403,7 @@ give you a chance to give a shout out to the other thing that you all have is,
403403
is how do you teach these things? Information and how do you get them to know things and,
404404

405405
00:11:11.340 --> 00:11:15.880
406-
and so on. And, you know, for the spacey world, you have Prodigy and maybe give a shout out to
406+
and so on. And, you know, for the spaCy world, you have Prodigy and maybe give a shout out to
407407

408408
00:11:15.880 --> 00:11:19.360
409409
Prodigy Teams. That's something you just are just announcing, right?
@@ -412,7 +412,7 @@ Prodigy Teams. That's something you just are just announcing, right?
412412
Yeah. So that's currently in beta. It's something we've been working on. So the idea of Prodigy has
413413

414414
00:11:23.420 --> 00:11:28.820
415-
always been, hey, you know, support a spacey, also other libraries. And how can we, yeah,
415+
always been, hey, you know, support a spaCy, also other libraries. And how can we, yeah,
416416

417417
00:11:28.820 --> 00:11:34.240
418418
how can we make the training and data collection process more efficient or so efficient that companies
@@ -1396,10 +1396,10 @@ So one of them is what I've called task specific models.
13961396
So that's really models that we're trying to do one specific or some specific things.
13971397

13981398
00:30:27.680 --> 00:30:30.100
1399-
It's kind of what we distribute for spacey.
1399+
It's kind of what we distribute for spaCy.
14001400

14011401
00:30:30.100 --> 00:30:37.460
1402-
There's also a lot of really cool community projects like size spacey for scientific biomedical
1402+
There's also a lot of really cool community projects like size spaCy for scientific biomedical
14031403

14041404
00:30:37.460 --> 00:30:38.280
14051405
techs.

transcripts/477-spacy-nlp-v2.txt

Lines changed: 14 additions & 14 deletions
@@ -798,7 +798,7 @@
798798

799799
00:19:53 Well, I'll link that as well.
800800

801-
00:19:56 Now, let's dive into the whole NLP and spacey side of things.
801+
00:19:56 Now, let's dive into the whole NLP and spaCy side of things.
802802

803803
00:20:00 I had Ines from Explosion on just back a couple months ago in June.
804804

@@ -808,13 +808,13 @@
808808

809809
00:20:13 So two to three months ago.
810810

811-
00:20:15 Anyway, we talked more about LLMs, not so much spacey, even though she's behind it.
811+
00:20:15 Anyway, we talked more about LLMs, not so much spaCy, even though she's behind it.
812812

813-
00:20:20 So give people a sense of what is spacey.
813+
00:20:20 So give people a sense of what is spaCy.
814814

815815
00:20:23 We just talked about Scikit-Learn and the types of problems it solves.
816816

817-
00:20:26 What about spacey?
817+
00:20:26 What about spaCy?
818818

819819
00:20:28 There's a couple of stories that could be told about it.
820820

@@ -834,9 +834,9 @@
834834

835835
00:20:56 And it was definitely kind of useful, but it wasn't necessarily a coherent pipeline.
836836

837-
00:20:59 And one way to, I think, historically describe spacey, it was like a very honest, good attempt to make a pipeline for all these different NLP components that kind of click together.
837+
00:20:59 And one way to, I think, historically describe spaCy, it was like a very honest, good attempt to make a pipeline for all these different NLP components that kind of click together.
838838

839-
00:21:09 And the first component inside of spacey that made it popular was basically a tokenizer.
839+
00:21:09 And the first component inside of spaCy that made it popular was basically a tokenizer.
840840

841841
00:21:15 Something I can take text and split it up into separate words.
842842

@@ -870,7 +870,7 @@
870870

871871
00:21:46 Because then if you like went back when I worked at the company, I used to work at Explosion just for context.
872872

873-
00:21:51 They would emphasize like the way you spell spacey is not with a capital S, it's with a capital C.
873+
00:21:51 They would emphasize like the way you spell spaCy is not with a capital S, it's with a capital C.
874874

875875
00:21:55 It's like when you go and put what is your ___location and your social media.
876876

@@ -932,7 +932,7 @@
932932

933933
00:23:01 This is going to happen.
934934

935-
00:23:02 But anyway, but back to spacey, I suppose.
935+
00:23:02 But anyway, but back to spaCy, I suppose.
936936

937937
00:23:04 Like this is sort of the origin story.
938938

@@ -1702,13 +1702,13 @@
17021702

17031703
00:41:11 This is the thing that people don't always recognize.
17041704

1705-
00:41:12 But the way that spacey is made, if you're from scikit-learn, this sounds a bit surprising
1705+
00:41:12 But the way that spaCy is made, if you're from scikit-learn, this sounds a bit surprising
17061706

17071707
00:41:17 because in scikit-learn land, you are typically used to the fact that you do batching and stuff
17081708

17091709
00:41:21 that's vectorized and numpy and that's sort of the way you would do it.
17101710

1711-
00:41:23 But spacey actually has a small preference to using generators.
1711+
00:41:23 But spaCy actually has a small preference to using generators.
17121712

17131713
00:41:27 And the whole thinking is that in natural language problems, you are typically dealing
17141714

@@ -1762,7 +1762,7 @@
17621762

17631763
00:42:46 that.
17641764

1765-
00:42:46 But my spacey habit would always be do the generator thing.
1765+
00:42:46 But my spaCy habit would always be do the generator thing.
17661766

17671767
00:42:49 Yeah.
17681768

@@ -1782,7 +1782,7 @@
17821782

17831783
00:43:12 nested data structures as well.
17841784

1785-
00:43:13 So that's the first thing that I usually end up doing when I'm doing something with spacey.
1785+
00:43:13 So that's the first thing that I usually end up doing when I'm doing something with spaCy.
17861786

17871787
00:43:17 Just get it into a generator.
17881788

@@ -2330,7 +2330,7 @@
23302330

23312331
00:57:17 A trick that I always like to use in terms of what examples should I annotate first?
23322332

2333-
00:57:22 At some point, you got to imagine I have some sort of spacey model.
2333+
00:57:22 At some point, you got to imagine I have some sort of spaCy model.
23342334

23352335
00:57:25 Maybe it has like 200 data points of labels.
23362336

@@ -2340,7 +2340,7 @@
23402340

23412341
00:57:31 When those two models disagree, something interesting is usually happening.
23422342

2343-
00:57:35 Because the LLM model is pretty good and the spacey model is pretty good.
2343+
00:57:35 Because the LLM model is pretty good and the spaCy model is pretty good.
23442344

23452345
00:57:38 But when they disagree, then I'm probably dealing with either a model that can be improved or data point that's just kind of tricky or something like that.
23462346
